
Sorting takes 2 hours on Vagrant (approx. 100M lines)

What can I do to optimize this sort?

I am running:

JavaScript

and then:

JavaScript

getting the following output:

JavaScript

Here's the dataset that I am using:

enter image description here

A preview of the original dataset:

enter image description here

Here are the details on the Vagrant machine:

enter image description here

What can I do to optimize this sort?


Answer

Split your data into several files, sort each file in parallel, then merge the sorted files together. See here for an example.
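To make the split / sort / merge idea concrete, here is a minimal Python sketch. It is not the code from the question (that was not preserved), and it makes some assumptions: the input is a plain text file with one record per line, lines are compared as whole strings in code point order (not locale order), and each chunk fits in memory. The function names, `chunk_lines`, and `workers` are hypothetical and chosen only for illustration.

```python
import heapq
import os
import tempfile
from concurrent.futures import ProcessPoolExecutor
from itertools import islice


def sort_chunk(path):
    """Sort one chunk file in place (this is what each worker runs)."""
    with open(path) as f:
        lines = f.readlines()
    lines.sort()
    with open(path, "w") as f:
        f.writelines(lines)
    return path


def split_file(src, chunk_lines, tmp_dir):
    """Copy src into chunk files of at most chunk_lines lines each."""
    paths = []
    with open(src) as f:
        while True:
            chunk = list(islice(f, chunk_lines))
            if not chunk:
                break
            # Make sure the last line ends with a newline so that
            # merged lines never run together.
            if not chunk[-1].endswith("\n"):
                chunk[-1] += "\n"
            path = os.path.join(tmp_dir, f"chunk_{len(paths):05d}.txt")
            with open(path, "w") as out:
                out.writelines(chunk)
            paths.append(path)
    return paths


def external_sort(src, dst, chunk_lines=1_000_000, workers=4):
    """Split src, sort the chunks (in parallel if workers > 1), merge into dst."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        paths = split_file(src, chunk_lines, tmp_dir)
        if workers > 1:
            # One chunk per task; each task runs in a separate process.
            with ProcessPoolExecutor(max_workers=workers) as pool:
                list(pool.map(sort_chunk, paths))
        else:
            for p in paths:
                sort_chunk(p)
        # k-way merge: heapq.merge reads the sorted chunks lazily,
        # so only one line per chunk is held in memory at a time.
        files = [open(p) for p in paths]
        try:
            with open(dst, "w") as out:
                out.writelines(heapq.merge(*files))
        finally:
            for f in files:
                f.close()
```

You would call it as something like `external_sort("data.txt", "sorted.txt")`, with placeholder file names, from inside an `if __name__ == "__main__":` block so the worker processes can start cleanly. Pick `chunk_lines` so that `workers` chunks fit in the VM's RAM at the same time; otherwise the machine starts swapping and the parallelism does not help.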

User contributions licensed under: CC BY-SA