LinuxQuestions.org
XML - beautifying

Posted 04-02-2014 at 07:19 AM by sashi_hc
Updated 04-02-2014 at 10:05 AM by sashi_hc

You want to make your XML more readable
You have just extracted an XML document from some source and it has no indentation or line breaks, which makes it hard to read. This happens regularly at my workplace, where a single XML file can be many megabytes in size.

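Using xmllint
xmllint --format existing_file.xml > new_file.xml

Replace the two file names with your own. The original file is left untouched and the indented copy is written to the new file.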
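With files that are many megabytes in size, it can help to avoid leaving a half-written output file behind when xmllint fails, for example because the input is not well-formed. A minimal sketch, assuming a POSIX shell with mktemp available; the function name pretty_xml and the temporary-file handling are illustrative assumptions, not something xmllint provides:

```shell
#!/bin/sh
# Sketch: format an XML file with xmllint without touching the original.
# pretty_xml is a hypothetical helper name, not part of xmllint.
pretty_xml() {
    src=$1
    dst=$2

    # Bail out early if xmllint is not on the PATH.
    if ! command -v xmllint >/dev/null 2>&1; then
        echo "xmllint not found on PATH" >&2
        return 127
    fi

    tmp=$(mktemp) || return 1

    # xmllint exits non-zero when it cannot parse the input,
    # so the result is only moved into place on success.
    if xmllint --format "$src" > "$tmp"; then
        mv "$tmp" "$dst"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage (file names are placeholders):
#   pretty_xml existing_file.xml new_file.xml
```

The function returns a non-zero status when xmllint is missing or fails, so it can be chained with && in a larger script.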

If you do not have xmllint installed, you can use the Perl script below instead. It is slower, but gives a similarly indented result. This is not my code; see the author information in the header comments. It works well.

#!/usr/bin/perl
#
# Purpose: Read an XML file and indent it for ease of reading
# Author: RedGrittyBrick 2011.
# Licence: Creative Commons Attribution-ShareAlike 3.0 Unported License
#
use strict;
use warnings;

my $filename = $ARGV[0];
die "Usage: $0 filename\n" unless $filename;

# Read the whole file into a single string
open my $fh, '<', $filename
    or die "Can't read '$filename' because $!\n";
my $xml = '';
while (<$fh>) { $xml .= $_; }
close $fh;

$xml =~ s|>\s+<|><|g;   # remove superfluous whitespace between tags
$xml =~ s|><|>\n<|g;    # split line at consecutive tags

my $indent = 0;
for my $line (split /\n/, $xml) {

    # a closing tag moves out one level (never below zero)
    if ($line =~ m|^</| && $indent > 0) { $indent--; }

    print ' ' x $indent, $line, "\n";

    # <?xml ...?>, <!-- ... --> and <!DOCTYPE ...> do not open a level
    if ($line =~ m|^<[^/?!]|) { $indent++; }               # indent after <foo
    if ($line =~ m|^<[^/?!][^>]*>[^<]*</|) { $indent--; }  # but not <foo>..</foo>
    if ($line =~ m|^<[^/?!][^>]*/>|) { $indent--; }        # and not <foo/>

}
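
To see what the script's two substitutions produce before any indentation is applied, here is a rough shell sketch of the same idea using awk rather than Perl. The function name split_tags is made up for illustration. Unlike the Perl script, awk works one input line at a time, so whitespace that spans a line break between two tags is not removed here.

```shell
#!/bin/sh
# Sketch of the script's first two steps, done with awk instead of Perl.
# split_tags is a hypothetical helper name.
split_tags() {
    awk '{
        gsub(/>[ \t]+</, "><")   # drop spaces and tabs between adjacent tags
        gsub(/></, ">\n<")       # start a new line at each pair of adjacent tags
        print
    }'
}

printf '<a> <b>text</b><c/></a>\n' | split_tags
```

This prints the four lines `<a>`, `<b>text</b>`, `<c/>` and `</a>`, each on its own line. The loop in the Perl script then walks over lines like these and decides how far to indent each one.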