You will find a French version of this article here
The AttributeManager is a transformer that has been in the toolbox of any good FMEist since 2016.
It lets you:
- rename attributes
- delete attributes
- create attributes
- set the value of an attribute, whether by arithmetic operation, conditional value or concatenation of fields
Before 2016, each of these functions was available in FME through dedicated transformers; AttributeCreator, for instance, creates new attributes, and AreaCalculator calculates the area of polygons.
We might therefore think that, 5 years after its release, users have become accustomed to it (which is largely true, given its 2nd place in the transformer ranking), and that Safe would decide to abandon its predecessors.
But this is not the case: they continue to live happily together.
Why a speed test?
My assumption is that performance differs depending on the use case. This could explain why transformers with similar functions still exist in parallel.
Let’s take a closer look, using for the test 4 transformers that the AttributeManager could functionally replace:
- AttributeRenamer, to rename attributes
- AttributeCreator, to create new attributes
- AttributeKeeper, to keep only selected attributes
- AreaCalculator, to calculate the area of polygons
To carry out this test, I used the PCI (Plan Cadastral Informatisé) parcels of Charente-Maritime: nearly 1 GB of data, with 1,671,935 records.
We will call the AttributeManager transformer *AM* in the rest of this article to keep the text readable.
You will find the workbench that allowed me to perform the performance tests here.
The AM head to head with its ancestors
To save some time, Feature Caching is enabled for all of the following tests, but only just after the input data has been read.
Each time a processing time is given, I ran the same transformer (or chain of transformers) 5 times and averaged the 5 run times.
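As an illustration of that protocol, here is a minimal Python sketch of "run 5 times and take the mean". It is not the workbench used for this article: `average_runtime` and `sample_task` are hypothetical names, and `sample_task` is only a stand-in for launching the workspace under test.

```python
import statistics
import time
from typing import Callable


def average_runtime(task: Callable[[], object], runs: int = 5) -> float:
    """Run `task` `runs` times and return the mean wall-clock time in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)


def sample_task() -> int:
    # Hypothetical stand-in for one run of the transformer (or chain) under test.
    return sum(range(100_000))


if __name__ == "__main__":
    print(f"mean over 5 runs: {average_runtime(sample_task):.4f} s")
```

Averaging over several runs smooths out the run-to-run noise (disk cache, other processes) that a single measurement would include.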
For the AM against each of its ancestors, one by one and for the same tasks, you will find the summary of times below:
For now, except for attribute creation, where some doubt may remain (the gap is rather small), the “old” specific transformers appear more efficient than the AM.
It’s like having a Leatherman and a Phillips screwdriver on hand when you only have Phillips screws to drive. You’ll probably go faster with the screwdriver.
The AM as a replacement for a chain of predecessors
Let’s now compare sequences of the previous 4 transformers with the same tasks performed in a single AM.
First of all, it should be noted that a single AM cannot always do the same job as a chain of single-purpose transformers.
Take the case where you want to create a new Attribute 2 from an existing Attribute 1, and delete Attribute 1 right afterwards.
With the historical transformers, you just chain an AttributeCreator and then an AttributeRemover (or an AttributeKeeper, for those who like to check a lot of boxes).
Here, one AM is not enough.
Indeed, if you tell the AM in its parameters both to create a field from an attribute and to delete that same attribute, it may not understand what you want it to do…
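To see why the order matters, here is a loose analogy in plain Python. It only models a feature as a dictionary and says nothing about how the AM is actually implemented; the function and attribute names are hypothetical.

```python
def create_then_remove(feature: dict) -> dict:
    """Two explicit steps, like AttributeCreator followed by AttributeRemover."""
    out = dict(feature)
    # Step 1: derive attribute_2 from attribute_1 while it still exists.
    out["attribute_2"] = out["attribute_1"].upper()
    # Step 2: only now drop attribute_1.
    del out["attribute_1"]
    return out


def remove_then_create(feature: dict) -> dict:
    """Same two operations in the opposite order: the source value is lost."""
    out = dict(feature)
    del out["attribute_1"]
    # attribute_1 is already gone, so attribute_2 falls back to an empty string.
    out["attribute_2"] = out.get("attribute_1", "").upper()
    return out


if __name__ == "__main__":
    parcel = {"attribute_1": "parcel"}
    print(create_then_remove(parcel))  # {'attribute_2': 'PARCEL'}
    print(remove_then_create(parcel))  # {'attribute_2': ''}
```

A chain of two transformers makes the order explicit, which is exactly what a single step asked to do both at once cannot guarantee.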
Chain of 2 transformers:
Chain of 3 transformers:
Chain of 4 transformers:
We see here that as soon as at least 2 historical transformers are chained, a single AM doing the same work saves time.
It can sometimes even be twice as fast!
My guess is that the AM is optimized so that, when there are several tasks to perform, they are carried out as quickly as possible.
To return to the Leatherman metaphor: if you have to drive Phillips screws, straighten a rod and open a beer (yes, it can!), this time you’ll be faster with the Leatherman than picking up each tool one by one from your toolbox.
AM chain test
One last test, to push things a little further…
What happens in the situation mentioned above, where we create an Attribute 1 and an Attribute 2, then an Attribute 3 that depends on the first two, and want to follow this with the deletion of the first 2 created attributes?
Clearly, an AttributeCreator followed by an AttributeKeeper does the trick.
But if we want to go through the AM, we will have to chain 2 of them:
- AttributeCreator (creation of Attribute 1, Attribute 2 and Attribute 3 = Attribute 1 + Attribute 2) + AttributeKeeper (we keep only Attribute 3): 139 seconds
- AM that creates the 3 attributes + AM that keeps only Attribute 3: 148 seconds
Logically enough, the chain of 2 AMs is slower than the chain of 2 historical transformers.
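To put the gap between those two timings in perspective, a quick calculation (the helper name `relative_slowdown` is mine, the two timings are the ones measured above):

```python
def relative_slowdown(baseline_s: float, candidate_s: float) -> float:
    """Return how much slower `candidate_s` is than `baseline_s`, as a fraction."""
    return (candidate_s - baseline_s) / baseline_s


if __name__ == "__main__":
    creator_keeper_s = 139  # AttributeCreator + AttributeKeeper
    two_am_s = 148          # AM + AM
    extra_s = two_am_s - creator_keeper_s
    print(f"{extra_s} s slower, {relative_slowdown(creator_keeper_s, two_am_s):.1%}")
    # prints "9 s slower, 6.5%"
```

So the 2-AM chain costs about 6.5% more time on this dataset.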
If you are going to use 2 different Leathermans, each for a different task, you might as well take the specific tools, and you will save time!
Conclusion
After these few quick tests, I think there is some truth in my initial hypothesis.
In fact, it is in our best interest to use a historical Attribute-something transformer rather than the AM only when a single operation is performed.
As soon as several operations are chained, we should not hesitate to use the AttributeManager directly in order to save processing time!
Leatherman metaphor. I rest my case…
Even so, it is always wise to be wary of the order in which the operations are carried out.
As mentioned earlier, avoid, for example, deleting an attribute in the same AM in which you want to perform a field calculation or concatenation on it.
Feel free to comment directly below or send me a message on Twitter; I will be happy to answer!