
Remove newline before a match – Linux

I want to remove the newline before the </script> in my HTML file with a Linux command (sed, awk…).

Sample input:

JavaScript

Sample output:

JavaScript

I tried several different syntaxes, but none of them worked.


Answer

First of all, as mentioned in the comments: don't parse XML with regex! Never do it, never think about it. Make it a habit not to think about it! Sometimes it might look like a simple task that can be performed with sed or awk or any other regex parser, but no …

What you can do, on the other hand, if you really want to use sed or awk, is process the file first with a PYX converter (the pyx subcommand of xmlstarlet is one such tool) and convert it into PYX format.

The PYX format is a line-oriented representation of XML documents that is derived from the SGML ESIS format. (see ESIS – ISO 8879 Element Structure Information Set spec, ISO/IEC JTC1/SC18/WG8 N931 (ESIS))
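To make the idea concrete, here is a minimal sketch of editing PYX records with awk. In PYX every line is one event: "(" opens an element, ")" closes it, "A" carries an attribute, and "-" carries text, with newlines inside text written as a literal \n. The sample record and the function name strip_newline_before_close are hypothetical, made up for illustration; a real converter may split the text into several "-" lines, so treat this as a starting point rather than a finished tool.

```shell
#!/bin/sh
# Hypothetical PYX fragment for a <script> element whose text ends in a newline.
# Assumption: the converter emits the text as one "-" line with escaped newlines.
pyx_sample='(script
-\nalert(1);\n
)script'

# Hold each line back by one so that a text line can still be edited
# when the next line turns out to be the closing script tag.
strip_newline_before_close() {
  awk '
    function flush() { if (have) print held }
    /^\)script$/ && have && held ~ /^-/ {
        sub(/(\\n)+$/, "", held)    # drop trailing escaped newlines
        if (held == "-") have = 0   # text became empty: drop the record
    }
    { flush(); held = $0; have = 1 }
    END { flush() }
  '
}

printf '%s\n' "$pyx_sample" | strip_newline_before_close
```

Because the filter only looks at line prefixes, it never has to guess where a tag starts or ends, which is the whole point of going through PYX. With xmlstarlet installed, the function could sit between xmlstarlet pyx and xmlstarlet p2x to convert back to XML, assuming the input is well-formed enough for the converter to read.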

So what you really want to do is something like:

JavaScript

In your case this would be something like:

JavaScript

This will output

JavaScript
User contributions licensed under: CC BY-SA