A blog about software development, primarily in Java and about web applications.

Tuesday, September 22, 2009

Vim Tip - replace comma-separated list of values in XML with individually tagged items

A Twitter post at http://stackoverflow.com/questions/1457537/help-with-grep-in-bbedit/ (formerly http://spreadsheets.google.com/viewform?formkey=dFR1ajFXOEVvd0JsRWNTWktVQzNfNUE6MA..) asked how to convert an XML element containing a comma-separated list of values into individually tagged items, each on its own line. The poster wanted to do this in BBEdit, which I don't use. This is how to do it in Vim:

:%s/<dc:subject>\(\_[^<]\+\)<\/dc:subject>/\=substitute(submatch(0), ",[ \t]\*", "<\/dc:subject>\r<dc:subject>", "g")/g

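The above Vim search-and-replace command handles tags that span multiple lines as well as multiple tagged lists on the same line. I use [ \t]\* in the substitute() call because the atom \s (any whitespace) did not work there; I had to list a space and a tab explicitly. The likely cause is the double-quoted string: Vim drops the backslash from an escape it does not recognize, so "\s" reaches substitute() as a plain s.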
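If the double-quoted string is what defeats \s, then single-quoted strings, which keep backslashes literal, should let it through. The variant below is a sketch under that assumption and not the command from the original answer. The # delimiter is my own choice, made so that the / in the closing tag needs no escaping.

```vim
" Variant sketch: single-quoted strings pass \s and \r to substitute()
" unchanged, so \s matches whitespace and \r becomes the line break.
" The :s delimiter is # (an assumption of this sketch), so </dc:subject>
" can be written without a backslash before the slash.
:%s#<dc:subject>\(\_[^<]\+\)</dc:subject>#\=substitute(submatch(0), ',\s*', '</dc:subject>\r<dc:subject>', 'g')#g
```

Try it on a copy of the file first; like the original, it matches only spaces and tabs after a comma, not a newline.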

\_[^<]\+ matches one or more characters that are not <, newlines included, which is the entire content between the original opening and closing tags.

The \=substitute(submatch(0)...) replacement string is an expression: it runs a second search and replace on the matched text (submatch(0) is the whole match, tags included). That inner substitution replaces each comma, along with any spaces or tabs that follow it, with a closing tag, a line break (\r), and a new opening tag.
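To watch the two levels of substitution work, the command can be run in a scratch buffer. The following is a small sketch meant to be saved to a file and loaded with :source; the sample element and its values are made up for illustration.

```vim
" Scratch-buffer demo. The sample line is hypothetical test data.
new
call setline(1, '<dc:subject>Java, Vim,XML</dc:subject>')

" The command from this post, unchanged.
%s/<dc:subject>\(\_[^<]\+\)<\/dc:subject>/\=substitute(submatch(0), ",[ \t]\*", "<\/dc:subject>\r<dc:subject>", "g")/g

" The buffer should now hold three lines:
"   <dc:subject>Java</dc:subject>
"   <dc:subject>Vim</dc:subject>
"   <dc:subject>XML</dc:subject>
```

Both ", " and a bare "," are treated as separators, which is why the second and third values split cleanly even though they have no space between them.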
