Coding Guidelines in webMethods IS – Part 4

Hello, fellow guidelines enthusiasts. I am back with yet another part (the 4th) of the Coding Guidelines series.

The end of the series is near. Feeling blue? No need: we will continue our endeavors on other topics in the future.

The goddess of creativity has been merciful and allowed me to write 4 out of 6 posts I have planned in the series.

Now let’s dive right in because I guess you are not here (only) for my witty humor, but to find out more about today’s topic: performance guidelines.

 

Performance considerations

 

 

When you build integration services, you act as a bridge between two or more systems.
This means you have to leave the smallest footprint possible.

 

The time required to execute the integration code is just one piece of the puzzle.
Add to this the other parts (the time needed to execute the code of the systems you are connecting, the network lag, etc.) and you get a sense of what the end user experiences.

 

Performance is always important (especially in this day and age), but it is crucial for integrations, so let’s see how to transform our code into highly performant code.

 

 

Stateless services

 

Big thumbs up for stateless services. Use them whenever possible. Use them as the rule, not the exception.
I guess you will rarely need stateful IS services.

 

Stateless services do not maintain session state and therefore leave a lighter footprint on the IS (memory, database, cluster sync).

 

Newly created services are stateless by default (up to and including webMethods 8.2, the default was stateful).

 

 

Service Caching

 

Do you remember the iconic pop song “..Baby one more time”?

Well, caching is like that. Well not exactly like that, but similar.

Caching starts with identifying operations that retrieve data with the following properties:

    • it rarely changes
    • it needs to be retrieved often
    • it’s not performant to retrieve it every time from storage (DB, filesystem, etc.)

The next step is to save the data in an in-memory location (let’s call it… a cache).

 

Subsequent requests for that data will retrieve it from the cache, not the permanent storage.

So the data is stored one time in the cache and then the cache is hit one more time, and then again, and again (see what I did there?).

Obviously, the caching concept was not invented by Software AG. Their contribution here is Service Caching.

 

Service Caching details

 

Service Caching is a performance optimization feature of stateless IS services.

If this feature is enabled, the IS will store the complete pipeline (Input + Output) in a local cache for the time configured in the Cache Expire field.

If the service is called again with the same input, the values from the cache are returned to the caller and the service is not executed.
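
To make the mechanics concrete, below is a minimal, generic sketch of result caching keyed by the service inputs, with an expiry time. This is plain Java with hypothetical names, illustrating the concept only; it is not Software AG’s internal implementation of Service Caching.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Generic result cache: same inputs within the TTL => cached result,
// and the underlying "service" is not executed again.
final class ResultCache<K, V> {

    private static final class Entry<T> {
        final T value;
        final long expiresAt;
        Entry(T value, long ttlMillis) {
            this.value = value;
            this.expiresAt = System.currentTimeMillis() + ttlMillis;
        }
    }

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final long ttlMillis; // plays the role of the Cache expire setting

    ResultCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    V get(K inputs, Function<K, V> service) {
        Entry<V> entry = cache.get(inputs);
        if (entry != null && System.currentTimeMillis() < entry.expiresAt) {
            return entry.value;                       // cache hit: skip execution
        }
        V result = service.apply(inputs);             // cache miss: execute and store
        cache.put(inputs, new Entry<>(result, ttlMillis));
        return result;
    }
}
```

Calling get twice with equal inputs inside the TTL executes the service only once, which is exactly the behavior the Cache results and Cache expire properties control.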

 

The properties that IS lets you modify for caching purposes are:

    • Cache results: True to cache the service execution results; False otherwise
    • Cache expire: The number of minutes the pipeline contents remain in the memory after they are cached
    • Prefetch: True to have the server automatically refresh a cached result when it expires by re-executing the service with the same inputs
    • Prefetch activation: the minimum number of times a cached result must be hit for the server to prefetch the results. If the server retrieves the cached results fewer times than specified in the Prefetch activation property, the server will not prefetch the service results the next time the cache expires

 

Clearing the cache

 

Let’s say that you have cached the services that needed to be cached.
You see a performance improvement, because that is what caching does. But, but… the data that rarely changes has now changed.

 

At this point, the data in the cache is not in sync with the data in the storage and your integration will most probably not work correctly.

 

You have several options to resolve the issue:

    • clear the service cache (this can be done from the IS Admin console or from the service Properties in Designer)
    • clear the server cache (this can be done from the Service Usage page of the IS Admin console)
    • wait for the cache to expire
    • restart the server

As you can see, some options are more drastic than others, so pick the one that fits your situation.

 

Caching guidelines

 

Before we go on to other topics, I am going to lay out some guidelines on using cached services:

  • To know at first glance which services are cached, put them in a special folder called cached
  • Be aware of the size of the data you cache and remember that IS saves both input and output; if you try to cache large data you might run into memory problems
  • Use cached services in transformers
  • If you cannot use cached services in transformers, then the cached service must start with a call to the pub.flow:clearPipeline service. This is done to make sure that residual pipeline data coming from the caller service is not cached (PS: check the comments section; there are good discussions there on this topic)

 

 

Startup and Shutdown services

 

All the initialization steps should be done on package load. These include, but are not limited to, loading of static data, registering handlers, creating or initializing artifacts, etc.

For these cases, use the Startup services functionality.

 

This way you make sure that all the preparation is done when the package is loaded (manual or programmatic reload and deployment) and not when the actual, time-critical functionality is called.

Mirroring this, if there is a need for cleanup at shutdown time, use the Shutdown services functionality.
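
As a sketch, here is what the body of a Java startup/shutdown pair could look like. The class, method names, and loaded data are hypothetical; in a real package you would register the two services under the package’s Startup Services and Shutdown Services lists.

```java
import com.wm.app.b2b.server.ServiceException;
import com.wm.data.IData;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical package-initialization services. The startup service runs once
// at package load, so request-time services never pay the loading cost.
public final class PackageInit {

    // shared, read-mostly lookup data, populated once at package load
    private static final Map<String, String> COUNTRY_NAMES = new ConcurrentHashMap<>();

    public static void startup(IData pipeline) throws ServiceException {
        // load static data here (from a DB, a file, etc.) -- once, not per request
        COUNTRY_NAMES.put("RO", "Romania");
        COUNTRY_NAMES.put("DE", "Germany");
    }

    public static void shutdown(IData pipeline) throws ServiceException {
        COUNTRY_NAMES.clear(); // mirror the initialization with cleanup at unload
    }

    // request-time services just read the prepared data
    static String lookupCountry(String code) {
        return COUNTRY_NAMES.get(code);
    }
}
```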

 

 

Save/Restore the pipeline

 

There are 2 ways to save and restore the pipeline of a service.

Option 1 refers to the Pipeline debug property available in the service Properties section.

Setting this property to anything other than None will result in unnecessary I/O operations and reduce service performance, as these services need to read from or write to the disk.

 

Option 2 refers to using the dedicated WmPublic services that save or restore the pipeline. See their list below:

  • pub.flow:restorePipeline
  • pub.flow:restorePipelineFromFile
  • pub.flow:savePipeline
  • pub.flow:savePipelineToFile
  • pub.flow:tracePipeline

The usage of these services should be prohibited in non-development environments.
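
If you do need a pipeline snapshot during a development debugging session, a guard can keep it from running anywhere else. Below is a sketch from a Java service; the environment system property is an assumption (use whatever environment flag your landscape provides), and the fileName input follows the WmPublic documentation for pub.flow:savePipelineToFile.

```java
import com.wm.app.b2b.server.Service;
import com.wm.data.IData;
import com.wm.data.IDataCursor;
import com.wm.data.IDataUtil;
import com.wm.lang.ns.NSName;

public final class DebugDump {

    // Dumps the current pipeline to a file -- but only in development.
    public static void dumpPipeline(IData pipeline) throws Exception {
        if (!"dev".equals(System.getProperty("environment"))) { // assumed flag
            return; // never persist pipelines outside development
        }
        IDataCursor cursor = pipeline.getCursor();
        IDataUtil.put(cursor, "fileName", "pipeline/myService-debug.xml");
        cursor.destroy();
        Service.doInvoke(NSName.create("pub.flow", "savePipelineToFile"), pipeline);
    }
}
```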

 

 

Service Audit

 

The service audit topic is a tricky one in relation to performance; I will explain why later on.

First, let’s see what Service Audit is.

 

Service Audit definition

 

This feature (that can be configured in the Properties panel of the service) is used to save certain data (start and finish service timestamps, pipeline data) into a webMethods storage for later use. The storage can be either the file system or a database.

 

Below you can find the Audit properties and their possible values:

    • Enable auditing: Never | When top-level service only | Always
    • Log on: Error only | Error and success | Error, success, and start
    • Include pipeline: Never | On errors only | Always

What the values mean is self-explanatory, but for a more in-depth read you can check the Service_Development_Help document.

The one that might need extra explanation is When top-level service only.

When Enable auditing is set to this value, audit data is generated only when the service is invoked from a client request or from a trigger.
Audit data will not be generated when the service is invoked from another service.

Now that we know what Service Audit is, let’s see what it can be used for.

 

Service Audit use cases

 

Service Audit will have different use cases based on the combinations of the property values above.

  • Error auditing: used to track and re-invoke failed services
    • Enable auditing = When top-level service only | Always
    • Log on = Error only
    • Include pipeline = On errors only
  • Service auditing: used to see the number of successful and failed invocations of a service
    • Enable auditing = When top-level service only
    • Log on = Error and success
    • Include pipeline = On errors only
  • Auditing for Recovery: used to track executed services and resubmit service invocations
    • Enable auditing = When top-level service only
    • Log on = Error and success
    • Include pipeline = Always
  • Auditing Long-Running Services: used to track the execution time of services
    • Enable auditing = When top-level service only
    • Log on = Error, success, and start
    • Include pipeline = Never

 

 

Service Audit conclusions and takeaways

 

Preliminary conclusion: Service Audit is your trusted sidekick when it comes to investigating problems and bottlenecks, and thus it helps you improve performance.

However, as the saying goes: there is no such thing as a free lunch.

 

Service Audit comes with some performance costs of its own. Why is that? Because the IS now has to save the audit data to storage on top of executing the actual service.

The performance impact becomes more critical if the pipeline is included, and even more so if that pipeline is a large one.

In the Service_Development_Help document, you can check the performance impact of each value of the audit properties.

 

One way to reduce the footprint that audit adds to your service execution is to configure the audit logging as asynchronous.
This can be done from the IS Admin Console (Settings -> Logging -> Service Logger).

If set to async, the audit logging will not write directly to the end storage (database or file) but to a queue, and from there to the storage.
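
To see why asynchronous logging helps, here is a generic producer/consumer sketch of the queue-then-storage pattern described above. It illustrates the general idea only; it is not the actual IS audit logger code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Generic async logger: the service thread only enqueues the audit record;
// a background thread pays the storage I/O cost.
final class AsyncAuditLogger {

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(10_000);

    AsyncAuditLogger() {
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    String record = queue.take(); // waits for the next record
                    writeToStorage(record);       // slow I/O happens off the service thread
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "audit-writer");
        writer.setDaemon(true);
        writer.start();
    }

    // Called from the service thread: returns immediately, no disk/DB wait.
    void log(String record) {
        queue.offer(record); // bounded queue; the overflow policy is a design choice
    }

    private void writeToStorage(String record) {
        // a database insert or file append would go here
    }
}
```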

 

Service Audit recommendations

 

Based on what I have said so far, you can decide whether or not you need audit logging (and what verbosity you require).

My recommendations are below:

    • If you are using a database as audit storage make sure that it is archived on a regular basis
    • Make the audit logging async
    • Include the pipeline only if necessary (e.g., if you need to resubmit the service)

 

I will end this section with: Question everything! Challenge everything!

If you add audit logging to your services, make sure to measure the impact that this has on your service performance (via a load test).

Tweak your configuration as much as you need until you reach the desired ratio between audit logging verbosity and performance.

 

 

Keep the pipeline clean

 

I have said this before, I know, but I will repeat it whenever I have the chance: Pipeline variables should be dropped when they are no longer needed.

 

A clean pipeline results in faster IData cursor access times and lower memory consumption, hence better performance.

A messy pipeline is sloppy and error-prone and can lead to introducing defects in your implementation. Let me make my case with an example.

 

There is a Flow language feature (I think it is called implicit mapping) that links pipeline input variables to service inputs by default if they have the same name.

This feature has a good reason for existing: it speeds up development by removing the need to manually link these fields.

 

The problem arises in messy pipelines, where un-dropped variables (variables that should have been dropped) get implicitly mapped to service inputs they have nothing to do with.

In the example above the inputValue variable should have been dropped at the first Map step.

Because it was not, it was implicitly mapped to the input of the setVariable service.

The variable correctInputForsetVariableService would have been the correct choice.

Imagine this happening in a much larger service. Rather easy to overlook => a nice little bug has appeared.
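
For Java services, the equivalent of Flow’s drop is removing the key from the pipeline’s IData so it can no longer be implicitly mapped. A minimal sketch using the standard IData cursor API (the helper name is mine):

```java
import com.wm.data.IData;
import com.wm.data.IDataCursor;

public final class PipelineHygiene {

    // Removes a variable from the pipeline -- the Java-side "drop".
    public static void dropVariable(IData pipeline, String name) {
        IDataCursor cursor = pipeline.getCursor();
        try {
            if (cursor.first(name)) { // position the cursor on the variable, if present
                cursor.delete();      // remove the key/value pair from the pipeline
            }
        } finally {
            cursor.destroy();         // always release the cursor
        }
    }
}
```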

 

 

Clear pipeline

 

As you might know, there is a service (pub.flow:clearPipeline) that clears the pipeline if you do not want to do it manually.

I do not like to use this service as I think it promotes laziness.
You are less likely to keep a clean pipeline if you know that at the end there is a service that will clean it up for you.

 

Also, if you keep a clean pipeline this service becomes redundant.

Additionally, pub.flow:clearPipeline adds unnecessary overhead, because the whole pipeline is walked to determine whether each variable should be dropped or preserved.

 

In the case of large pipelines, the execution of this service can have a significant negative impact on performance.

Exception: In cached services, a call to pub.flow:clearPipeline should be put at the very start of the service (if you plan to call the service by direct invocation and not as a transformer).

LATER EDIT: Try not to use this service at all. As you can also see in the comments, there are far better options for the caching scenario.
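
The more elegant option discussed in the comments below is to scope the call: build a fresh pipeline containing only the declared inputs, so there is no residual data to clear (and nothing extra for a cached service to cache). Here is a sketch from a Java service; Service.doInvoke and the IData factory/cursor calls are the standard IS Java API, while the service name util:getCountryName and its code input are hypothetical.

```java
import com.wm.app.b2b.server.Service;
import com.wm.data.IData;
import com.wm.data.IDataCursor;
import com.wm.data.IDataFactory;
import com.wm.data.IDataUtil;
import com.wm.lang.ns.NSName;

public final class ScopedCall {

    // Invokes a (possibly cached) service with a freshly built pipeline,
    // so only the declared input ever reaches it.
    public static IData invokeScoped(String code) throws Exception {
        IData input = IDataFactory.create();   // fresh pipeline: no residual data
        IDataCursor cursor = input.getCursor();
        IDataUtil.put(cursor, "code", code);   // only the declared input
        cursor.destroy();
        return Service.doInvoke(NSName.create("util", "getCountryName"), input);
    }
}
```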

 

 

Disabled code

 

Here is another Flow feature that helps at development time, but impacts performance if you are not paying attention.

Rule: Delete rather than disable unused code.

Benefits of following the rule:

    • increased performance, because all the Flow code is interpreted during execution (both enabled and disabled code)
    • more readable and maintainable Flow code, and therefore a lower cost of maintenance

 

 

This is, my dear readers, the end of the 4th chapter of the Coding Guidelines story.
Next time we will finish up the performance guidelines and outline basic security rules.

 

In the meantime, you can check this resource I recently found.

It is written in a very succinct style, and while I cannot say I agree with all the guidelines there, it certainly has its fair share of valuable information.

 

Until next time, as always…

 

Happy exploring,
Tury

16 thoughts on “Coding Guidelines in webMethods IS – Part 4”

  • Mohammed Hakkim

    Hi vlan,

    I went through this part. It was really awesome.

    Just to know, how to find Disabled code/Disabled Steps in flow service?

    How to find & remove disabled steps, if any?

    • Hi Mohammed,

      Thank you for the feedback.

      There are multiple ways to find the Disabled Steps.

      1. Tool-less

      If you have Disabled Steps in your flow service, then your flow.xml file will contain this string: DISABLED="true"
      If you do a file search for this string you will find your service.

      2. Using special tools

      In part 6 I mention the Integration Server Continuous Code Review tool.
      This tool tells you, among other things, if you have disabled steps.

      I think option 1 is the most convenient, as well as the easiest to use.

  • reamon

    I too use scope to limit things. (Transformer is just a variation of scope, or vice versa). I had not seen the info that more recent versions automatically scope calls to the declared inputs for caching — thanks for the info! I’ll look for the details on that.

    I also avoid clearPipeline like the plague. 🙂

    I also never cache anything. For me it has never made a meaningful performance difference and only adds complexity and confusion. (You forgot to clear the cache.) People *think* reading from the DB or file is slow, but IME it almost never is the bottleneck. Measurements are key.

    In your post: “Performance is always important (especially in this day and age), but it is crucial for integrations”

    It depends entirely on the specifics of the integration. For most “unattended” behind-the-scenes integrations, performance isn’t much of a factor at all. It doesn’t matter if it takes 2 seconds or 2 hours — just as long as it gets there.

    Of course where wM IS is used to support request/response interactions and user is waiting for the response, performance is important. As noted above, measurements are key to find where the bottlenecks are. One may find that time is spent elsewhere and no amount of premature optimization of the IS code will make any difference.

  • RMG

    I also totally agree with using the scope functionality rather than calling clearPipeline explicitly; it is very popular in some scenarios, especially when dealing with TN routing.

    Big Cheers!

  • areader

    I don’t quite agree with the recommendation not to use clearPipeline. IMO this is the only way to guarantee that your service only returns what it claims to return in its output signature. Even if you drop all the unnecessary variables, chances are that some services called within your service leave behind data that is not declared in their output. This data will not be visible in the flow editor and hence cannot be dropped. The only cure against that is clearPipeline.

    • I agree with you that you cannot drop what you cannot see.

      The clear pipeline guideline goes hand in hand with a guideline I mentioned in Part 2 (https://wm-explorer.com/coding-guidelines-in-webmethods-is-second-part/). This guideline goes like: “A service should not return more variables than its contract states.”

      If we follow this guideline, then no service will return more variables than its signature specifies, and therefore there will be no need for a clearPipeline call.

      Now, I do realize that what I am saying is ideal and that we might have to call in our services other services that were not written by us and that do not comply with the “Respect the service contract” guideline.

      In such cases and in similar ones, we have no choice but to use clearPipeline.

      However, what I would not like to see is people creating their services and using clearPipeline as a given, as a default.
      My approach is to create my service without clearPipeline, then carefully test and review it. If a call to clearPipeline is needed and there is no way around it whatsoever, I make the call. But not otherwise.

      The reasons I do not like to use it are presented in the post.

      I have also observed some discrepancies on SAG’s side regarding this.
      On the one hand, there are several KB articles where the solution is to use clearPipeline.
      On the other hand, the Code Review tool built by SAG (small spoiler alert: I will mention and present it in the last post of the series) reports the presence of clearPipeline in the reviewed services and marks them as non-compliant.

  • Gerardo Lisboa

    Hi,

    Why do we have the service audit settings at _design time_?
    They should only be defined at runtime.

    Now, if you need to change them, you are actually changing the package, using developer tools, so it’s no longer the same thing you deployed…

    They should only exist on the IS’s configuration runtime, so the operations team might decide to put the service in audit and remove it at any time.

    I see no reason for the present situation (a Brainstorm request was opened).

    Best regards,

    • I totally agree with you on this topic. Updating the audit settings currently requires changing the package, a new build, a new deployment…
      Having the possibility to change them from the IS Admin console would remove this pain point and give the operations team a lot more freedom.

      If possible do share the Brainstorm request link so that we can like and watch the request.

      Thank you.

    • reamon

      +1

      This has long been a “why did they do that?” Similar to the old TN Console where design and run-time concerns were all mooshed together, making life a bit more challenging. 🙂

  • areader

    In general, you have written a very good series so far. If this were common knowledge in the wM world, wM software would be at a much higher level of quality.

  • areader

    > If you cannot use cached services in transformers then the cached service must start with a call to
    > the pub.flow:clearPipeline service. This is done to make sure that only the needed inputs are
    > passed to the service

    But isn’t it too late then? The cached service has already been called with the wrong pipeline. Clearing it *inside* the service does not change the fact that the input pipeline was not the intended one.

    • Thank you areader for the comment. You have identified a portion of the post that needs further explanation.

      The major usage of the clearPipeline service in the caching context is to prevent the caching of what I like to call residual pipeline data.
      If the cached service is called first from Service A, it will also cache the pipeline data sent from Service A.
      A subsequent call to the cached service from, let’s say, Service B will also retrieve Service A’s pipeline, which is kind of messy.

      As you correctly said in the comment, I was not suggesting that the clearPipeline call is useful for the cases where the cache is hit. It is, of course, not.

      Moreover, recent webMethods versions will automatically “scope” the received input down to the service input signature in order to decide whether they should hit the cache or execute the service. The clearPipeline service is no longer needed for this use case.

      Thanks for pointing this mis-explanation out to me.

      Will update the post so that it is clear.

      I think the takeaway is: do not use the clearPipeline service. As we discussed here in the comments, there are other, more elegant and performant options.

  • Gerardo Lisboa

    Hi,

    When I need to call a service which _I know_ is cached, and the transformer approach is not feasible, I do it inside a SEQUENCE with a “scope variable”.

    This way, I only put inside the scope variable the inputs I need, and take out of it the outputs I need, without the need for a call to `pub.flow:clearPipeline`.

    When I’m building a cached service which is going to be reused (by me or someone else, where I don’t control the development practices), I create a wrapper service, which is the one being exposed, and which uses that scoped approach so the developer does not have to do it.

    In that situation, I can also expose a control service so that the cache reset can be made correctly.

    Best regards,

    • Very good points Gerardo, thank you for the comment.

      I certainly like your usage of the Scope functionality and the client-oriented approach.

      You are right: because we cannot control the way the service is invoked, it is a good idea to structure our code in a flexible way.
