
Merge XML files and delete duplicate rows

I’m using the following script to merge XML files. There are five XML file types: ItemAvailability.xml, ItemUpsert.xml, ItemCatDesc.xml, ItemPrice.xml and ItemDelete.xml. How can I have the same script also remove duplicate rows from the combined XML files?

I really only need the duplicates removed from the ItemAvailability.xml files. The script runs as a cron job every 15 minutes. Any ideas would be helpful.

The script edited by @Sahil M works when I run it on only the ItemAvailability.xml files, but when I run all the XML files together I get the following error:

JavaScript

@Sahil M updated script

JavaScript

Original Script

JavaScript

To combine files, run:

JavaScript

Script source: Merge multiple XML files from command line

Sample XML files.

ItemAvailability.xml

JavaScript

ItemUpsert.xml

JavaScript

ItemCatDesc.xml

JavaScript

ItemPrice.xml

JavaScript

ItemDelete.xml

JavaScript

Sorry for the confusion here; I’m not trying to merge all the different file types into one. Here is how I have everything set up.

Every 15 minutes I automatically download a batch of files. There are usually 3 or 4 files of each type, so I end up with something like this:

ItemAvailability-111.xml, ItemAvailability-222.xml, ItemAvailability-333.xml
ItemUpsert-111.xml, ItemUpsert-222.xml, ItemUpsert-333.xml
ItemCatDesc-111.xml, ItemCatDesc-222.xml, ItemCatDesc-333.xml
ItemPrice-111.xml, ItemPrice-222.xml, ItemPrice-333.xml
ItemDelete-111.xml, ItemDelete-222.xml

Then my main script sorts these files and runs the merge script on each file type, so when it’s finished I have 5 files (ItemAvailability-combined.xml, ItemUpsert-combined.xml, ItemCatDesc-combined.xml, ItemPrice-combined.xml and ItemDelete-combined.xml) to import into my store.
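For readers following along, the per-type grouping described above could look roughly like the Python sketch below. It is only an illustration: the function name `group_by_type` and the assumption that every file is named `<Type>-<batch>.xml` are mine, inferred from the file names listed above, and are not taken from the actual main script.

```python
import re
from collections import defaultdict

# Assumed naming scheme: "<Type>-<batch>.xml", e.g. "ItemPrice-222.xml".
FILE_PATTERN = re.compile(r"^(?P<type>[A-Za-z]+)-(?P<batch>[^.]+)\.xml$")


def group_by_type(filenames):
    """Group batch files by their type prefix (hypothetical helper).

    Files that do not match the assumed pattern, and previously written
    "-combined" outputs, are skipped.
    """
    groups = defaultdict(list)
    for name in sorted(filenames):
        match = FILE_PATTERN.match(name)
        if match and match.group("batch") != "combined":
            groups[match.group("type")].append(name)
    return dict(groups)
```

Each resulting list can then be handed to the merge script, one file type at a time, so rows from different file types never end up in the same combined file.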


Answer

The following removes the duplicate rows from the combined files. Duplicate rows are identified by comparing the sub-elements of each row.

JavaScript
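As a rough illustration of comparing rows by their sub-elements, here is a minimal Python sketch using the standard-library `xml.etree.ElementTree`. It assumes each file has a single root element whose direct children are the rows; the helper names `row_key` and `merge_unique` are hypothetical and this is not necessarily how the script above does it.

```python
import xml.etree.ElementTree as ET


def row_key(row):
    """Return a hashable key built from the (tag, text) pairs of a row's children."""
    return frozenset((child.tag, (child.text or "").strip()) for child in row)


def merge_unique(xml_strings):
    """Merge the rows of several XML documents, skipping rows already seen.

    Assumes every document shares the same root tag. Returns the merged
    root element, or None if no documents were given.
    """
    merged_root = None
    seen = set()
    for text in xml_strings:
        root = ET.fromstring(text)
        if merged_root is None:
            # Reuse the first document's root tag and attributes.
            merged_root = ET.Element(root.tag, root.attrib)
        for row in root:
            key = row_key(row)
            if key not in seen:
                seen.add(key)
                merged_root.append(row)
    return merged_root
```

Because the key is a set of every sub-element, two rows count as duplicates only when all of their sub-elements match; a row that differs in a single value is kept.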

Merging a file with itself demonstrates that duplicate entries coming from other files are removed as well:

JavaScript

And the result:

JavaScript

Based on the update in the question, just change the set comparison to a single-element comparison. Your for loop should then be:

JavaScript
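To make the single-element idea concrete, here is a hedged Python sketch that treats one chosen sub-element as the identity of a row. The function name `dedupe_by_element` and the `key_tag` parameter are hypothetical, and the choice to keep the first row seen for each key is an assumption; adjust it if the newest row should win.

```python
import xml.etree.ElementTree as ET


def dedupe_by_element(root, key_tag):
    """Remove rows whose `key_tag` text was already seen on an earlier row.

    Rows are the direct children of `root`. Rows without a `key_tag`
    child are left untouched. The first occurrence of each key is kept.
    """
    seen = set()
    for row in list(root):  # iterate over a copy because rows are removed
        key = row.findtext(key_tag)
        if key is None:
            continue
        key = key.strip()
        if key in seen:
            root.remove(row)
        else:
            seen.add(key)
    return root
```

Compared with matching on every sub-element, this drops a later row even when its other values differ, which is usually what is wanted when the same item appears in several batch files.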

Then you can run:

JavaScript

And mergedFile.xml is:

JavaScript

You can have an arbitrary number of files with complete tags.

User contributions licensed under: CC BY-SA