LinuxQuestions.org
Visit Jeremy's Blog.
Home Forums Tutorials Articles Register
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Software
User Name
Password
Linux - Software This forum is for Software issues.
Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.

Notices


Reply
  Search this Thread
Old 06-12-2012, 02:26 PM   #1
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,688
Blog Entries: 31

Rep: Reputation: 176Reputation: 176
Database dump in easy to parse/grep format?


I'm looking for a way to dump an entire database to a stream that can be easily parsed or grepped. The database engine (for now) is MySQL. And I do have full dump backup files, so something that can convert one of those (in massive SQL format) to a suitable output stream should work, too.

Basically the output should be one text line per data cell, with the database name, table name, column name, and data value. Alternatively I could probably work with a JSON format output. XML is less desired, but might still be usable.

I have full rights to access the database, but the MYSQL tools don't give me a program-friendly output of everything.
 
Old 06-12-2012, 02:37 PM   #2
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Rep: Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918
i think a custom script based on mysqldump mite generate the desired output.

Last edited by schneidz; 06-12-2012 at 02:39 PM.
 
Old 06-12-2012, 04:23 PM   #3
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928
Quote:
Originally Posted by schneidz View Post
i think a custom script based on mysqldump mite generate the desired output.
I want one of those output generating mites, too! ;}


But yeah, you're right. And mysqldump w/
Code:
       ·   --fields-terminated-by=..., --fields-enclosed-by=..., --fields-optionally-enclosed-by=..., --fields-escaped-by=...
can make it as "machine-friendly" as the OP desires.



Cheers,
Tink
 
Old 06-13-2012, 01:05 PM   #4
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,688

Original Poster
Blog Entries: 31

Rep: Reputation: 176Reputation: 176
Quote:
Originally Posted by Tinkster View Post
I want one of those output generating mites, too! ;}


But yeah, you're right. And mysqldump w/
Code:
       ·   --fields-terminated-by=..., --fields-enclosed-by=..., --fields-optionally-enclosed-by=..., --fields-escaped-by=...
can make it as "machine-friendly" as the OP desires.



Cheers,
Tink
These seem to require the --tab=path (or -T path) option, and that option will not write to stdout as a stream. Instead it writes a bunch of files named per table. This also requires running on the same machine as the database engine. It's also unclear how the files will be named for multiple databases, or maybe it can only do one database.

The format I want is:

databasename<delimiter>tablename<delimiter>columnname<delimiter>value<newline>

with the value encoded in some way so delimiter, newline, and binary bytes are encoded as some notation like backslash-octal or percent-hexadecimal or such. Documentation seems to suggest this is not possible.

Any other ideas?
 
Old 06-13-2012, 01:13 PM   #5
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,688

Original Poster
Blog Entries: 31

Rep: Reputation: 176Reputation: 176
Quote:
Originally Posted by schneidz View Post
i think a custom script based on mysqldump mite generate the desired output.
Yeah. But how to get mysqldump to work in some suitable way appears to be the standing issue.

I think it is unlikely mysqldump is usable for this. I suspect this will require some independent program that can access MySQL. I also think it would be better if it can do that access at a layer below SQL processing ... e.g. a raw layer, since it's going to get everything.
 
Old 06-13-2012, 01:39 PM   #6
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,688

Original Poster
Blog Entries: 31

Rep: Reputation: 176Reputation: 176
It looks like the --xml option is the only thing viable with mysqldump. But now I need to find something to convert the XML over to something usable.
 
Old 06-13-2012, 01:59 PM   #7
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037Reputation: 2037Reputation: 2037Reputation: 2037Reputation: 2037Reputation: 2037Reputation: 2037Reputation: 2037Reputation: 2037Reputation: 2037Reputation: 2037
Quote:
Originally Posted by Skaperen View Post
It looks like the --xml option is the only thing viable with mysqldump. But now I need to find something to convert the XML over to something usable.
xmlstarlet can convert xml to pyx, which is, if not easy to parse, at least uniform in structure and better suited to tools like sed and awk.

http://xmlstar.sourceforge.net/doc/UG/ch04s07.html

There are probably other tools as well that can do the same.
 
Old 06-13-2012, 02:01 PM   #8
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Rep: Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918
can you do something custom like pipe the output of a sql script thru cut, awk, sed, tr, ... ?

can you share a dump of a sample table with a few rows and columns so we can get an idea if this is bash-able ?

edit: you can play around with the mysqlshow command to get the db, table and column names then loop thru the list with mysql.

Last edited by schneidz; 06-13-2012 at 03:07 PM.
 
Old 06-13-2012, 04:02 PM   #9
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928
Quote:
Originally Posted by Skaperen View Post
databasename<delimiter>tablename<delimiter>columnname<delimiter>value<newline>

with the value encoded in some way so delimiter, newline, and binary bytes are encoded as some notation like backslash-octal or percent-hexadecimal or such. Documentation seems to suggest this is not possible.

Any other ideas?
That data format seems kind of strange, very garrulous, but you could
conceivably get that result w/ perl or awk, assuming that your mysqldump
looks similar to the sample I just created (I don't use MySQL - bleagh).

You get
Code:
CREATE TABLE 'blah'
  'a' type constraint
  'b' type constraint
  'c' type constraint
You can capture these ('blah','a','b', ... ) in a numbered array

When you encounter an
Code:
INSERT INTO `blah` VALUES ( ...
and can get the values only (per row, if there's not ),( in any column).



Cheers,
Tink
 
Old 06-13-2012, 11:31 PM   #10
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,369

Rep: Reputation: 2753Reputation: 2753Reputation: 2753Reputation: 2753Reputation: 2753Reputation: 2753Reputation: 2753Reputation: 2753Reputation: 2753Reputation: 2753Reputation: 2753
I reckon you need to write some Perl to parse the mysqldump. That's basically an ASCII plain text file, so should be no problem.
In fact this kind of text/data munging is one of Perl's strong suits.

If you want to simplify the Perl a bit, ensure the dumps are set to create a new INSERT line for each DB rec (default mode iirc) instead of the (optional) multi-rec INSERT statements.

You can even just access the DB directly using DBI/DBD modules.
I've done it loads of times, even joining 2 DBs together to txfr data, ie ETL but no intervening temp files reqd

What exactly is the purpose here?
 
Old 06-14-2012, 12:58 AM   #12
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,688

Original Poster
Blog Entries: 31

Rep: Reputation: 176Reputation: 176
Quote:
Originally Posted by schneidz View Post
can you do something custom like pipe the output of a sql script thru cut, awk, sed, tr, ... ?

can you share a dump of a sample table with a few rows and columns so we can get an idea if this is bash-able ?

edit: you can play around with the mysqlshow command to get the db, table and column names then loop thru the list with mysql.
I do not see myself writing any kind of SQL parser in any language (or XML). That's why I'm looking for it "already done" in some tool. I'd even prefer to avoid an SQL database where I can. But other things like Drupal and Mediawiki want it.

The mysqlshow command might have some uses in scripts if there is a way to get it to stop drawing boxes and just give a raw list of data.
 
Old 06-14-2012, 01:14 AM   #13
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,688

Original Poster
Blog Entries: 31

Rep: Reputation: 176Reputation: 176
Quote:
Originally Posted by chrism01 View Post
What exactly is the purpose here?
To avoid writing a parser.
 
Old 06-14-2012, 01:17 AM   #14
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,688

Original Poster
Blog Entries: 31

Rep: Reputation: 176Reputation: 176
Quote:
Originally Posted by David the H. View Post
xmlstarlet can convert xml to pyx, which is, if not easy to parse, at least uniform in structure and better suited to tools like sed and awk.

http://xmlstar.sourceforge.net/doc/UG/ch04s07.html

There are probably other tools as well that can do the same.
Interesting format (pyx). It looks like lexified XML. That may simplify the parsing enough to make it doable for a "parsing challenged programmer" like me (I can handle tab-delimited, though).

UPDATE:

I ran a test. I took an XML dump of a Drupal website database, and converted it to pyx format, then back to xml ... and repeated this 499 times. It did NOT converge. In fact, it collapsed to about 1/3 the size of the original. It appears to be lossy. Program bug?

Last edited by Skaperen; 06-14-2012 at 02:15 AM. Reason: UPDATE
 
Old 06-14-2012, 08:00 AM   #15
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Rep: Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918
Quote:
Originally Posted by Skaperen View Post
The mysqlshow command might have some uses in scripts if there is a way to get it to stop drawing boxes and just give a raw list of data.
can you use somthing like tr "\|" "" on mysqlshow's output ?
 
  


Reply



Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
How to parse /etc/passwd by means of grep? PianoLinux Linux - Newbie 11 02-28-2012 10:43 PM
does PHPDoc parse database accesses? dping Programming 2 07-04-2009 12:47 PM
how can I format/parse the output of time? luusac Linux - Newbie 4 04-09-2009 05:18 AM
perl script to parse following format suomali Programming 11 09-24-2008 01:51 PM
Need help with grep, trying to parse/filter a file... patsnip Programming 4 08-29-2003 02:33 PM

LinuxQuestions.org > Forums > Linux Forums > Linux - Software

All times are GMT -5. The time now is 08:22 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration