I'm looking for a way to dump an entire database to a stream that can be easily parsed or grepped. The database engine (for now) is MySQL. And I do have full dump backup files, so something that can convert one of those (in massive SQL format) to a suitable output stream should work, too.
Basically the output should be one text line per data cell, with the database name, table name, column name, and data value. Alternatively I could probably work with a JSON format output. XML is less desired, but might still be usable.
I have full rights to access the database, but the MySQL tools don't give me a program-friendly output of everything.
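For example, something like this (made-up database/table/column names), one tab-delimited line per cell:
Code:
mydb	users	id	1
mydb	users	name	Alice
mydb	posts	title	Hello world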
mysqldump's --tab mode, with its --fields-terminated-by and --lines-terminated-by options, can make it as "machine-friendly" as the OP desires.
Cheers,
Tink
These seem to require the --tab=path (or -T path) option, and that option will not write to stdout as a stream; instead it writes a bunch of files, named per table. It also requires running on the same machine as the database engine. And it's unclear how the files would be named for multiple databases; maybe it can only do one database at a time.
What I want is one line per data cell, with the value encoded so that delimiter, newline, and binary bytes come out in some notation like backslash-octal or percent-hexadecimal. The documentation seems to suggest this is not possible.
I think a custom script based on mysqldump might generate the desired output.
Yeah. But how to get mysqldump to work in some suitable way appears to be the standing issue.
I think it is unlikely mysqldump is usable for this. I suspect this will require an independent program that can access MySQL. I also think it would be better if it could do that access at a layer below SQL processing, e.g. a raw layer, since it's going to fetch everything anyway.
Any other ideas?
That data format seems kind of strange, very verbose, but you could conceivably get that result with perl or awk, assuming that your mysqldump output looks similar to the sample I just created (I don't use MySQL - bleagh).
You get
Code:
CREATE TABLE `blah` (
  `a` type constraint,
  `b` type constraint,
  `c` type constraint
);
You can capture these names (`blah`, `a`, `b`, ...) in a numbered array.
When you encounter an
Code:
INSERT INTO `blah` VALUES ( ...
you can strip everything but the values and pair them with the column names, row by row (provided no column value contains the sequence ),( ).
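Here's a rough, untested sketch in perl along those lines. It assumes one-row-per-INSERT dumps (--skip-extended-insert) and splits values on bare commas, so commas inside quoted strings will misalign columns; treat it as a starting point, not a finished parser.
Code:
#!/usr/bin/perl
# Rough sketch: read a mysqldump file on stdin, print one tab-separated
# line per cell: database, table, column, value.
# CAVEAT: the comma split below is naive -- commas inside quoted string
# values will break the column alignment.
use strict;
use warnings;

my ($db, $table, @cols) = ('?');
while (<STDIN>) {
    if (/^USE `([^`]+)`;/) {                     # present in --all-databases dumps
        $db = $1;
    }
    elsif (/^CREATE TABLE `([^`]+)`/) {
        ($table, @cols) = ($1);                  # new table, reset column list
    }
    elsif (defined $table && /^\s+`([^`]+)`/) {  # a column definition line
        push @cols, $1;
    }
    elsif (/^INSERT INTO `([^`]+)` VALUES \((.*)\);/) {
        my $tbl  = $1;
        my @vals = split /,/, $2;                # NAIVE split, see caveat
        print join("\t", $db, $tbl, $cols[$_], $vals[$_] // ''), "\n"
            for 0 .. $#cols;
    }
}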
I reckon you need to write some Perl to parse the mysqldump. That's basically an ASCII plain text file, so should be no problem.
In fact this kind of text/data munging is one of Perl's strong suits.
If you want to simplify the Perl a bit, ensure the dumps are created with one INSERT line per DB record (mysqldump's --skip-extended-insert option) instead of the default multi-record INSERT statements.
You can even just access the DB directly using DBI/DBD modules.
I've done it loads of times, even joining 2 DBs together to transfer data, i.e. ETL, but with no intervening temp files required.
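FWIW, here's a minimal, untested sketch of the DBI route, with made-up credentials and database name. It emits one line per cell and backslash-octal-escapes tabs, newlines, and backslashes in values, which is roughly the encoding the OP asked for.
Code:
#!/usr/bin/perl
# Minimal sketch: walk every table in one database via DBI and print
# one tab-separated line per cell: database, table, column, value.
# Credentials and database name are made up -- adjust to taste.
use strict;
use warnings;
use DBI;

my $db  = 'mydb';
my $dbh = DBI->connect("DBI:mysql:database=$db", 'user', 'password',
                       { RaiseError => 1 });

for my $table (@{ $dbh->selectcol_arrayref('SHOW TABLES') }) {
    my $sth = $dbh->prepare("SELECT * FROM `$table`");
    $sth->execute;
    my $cols = $sth->{NAME};                   # column names, in order
    while (my $row = $sth->fetchrow_arrayref) {
        for my $i (0 .. $#$cols) {
            my $val = defined $row->[$i] ? $row->[$i] : 'NULL';
            $val =~ s/([\\\t\n])/sprintf '\\%03o', ord $1/ge;  # escape
            print join("\t", $db, $table, $cols->[$i], $val), "\n";
        }
    }
}
$dbh->disconnect;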
Can you do something custom, like piping the output of a SQL script through cut, awk, sed, tr, ...?
Can you share a dump of a sample table with a few rows and columns, so we can get an idea whether this is bash-able?
Edit: you can also play around with the mysqlshow command to get the database, table and column names, then loop through the list with mysql, as in the sketch below.
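Untested, but something like this ought to work for the enumeration part. It uses the plain mysql client in batch mode (-B gives tab-separated output, -N drops the header row), which skips mysqlshow's box drawing entirely, and I believe batch mode also escapes embedded tabs/newlines in the data.
Code:
#!/bin/sh
# Enumerate every database and table, then dump each table as raw
# tab-separated rows (connection options omitted for brevity).
for db in $(mysql -N -B -e 'SHOW DATABASES'); do
    for tbl in $(mysql -N -B -e "SHOW TABLES FROM \`$db\`"); do
        mysql -N -B -e "SELECT * FROM \`$db\`.\`$tbl\`"
    done
done
That gives one row per output line; turning rows into one-line-per-cell with column names would still need a pass over SHOW COLUMNS plus some awk.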
I do not see myself writing any kind of SQL parser in any language (or an XML one). That's why I'm looking for it "already done" in some tool. I'd even prefer to avoid an SQL database where I can, but other things like Drupal and MediaWiki want one.
The mysqlshow command might have some uses in scripts if there is a way to get it to stop drawing boxes and just give a raw list of data.
There are probably other tools as well that can do the same.
Interesting format (pyx). It looks like lexified XML. That may simplify the parsing enough to make it doable for a "parsing-challenged programmer" like me (I can handle tab-delimited, though).
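If I understand the notation right, each XML event becomes one prefixed line, so something like
Code:
<row id="1">hello</row>
would come out as
Code:
(row
Aid 1
-hello
)row
which is easy to process line by line.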
UPDATE:
I ran a test. I took an XML dump of a Drupal website database, and converted it to pyx format, then back to xml ... and repeated this 499 times. It did NOT converge. In fact, it collapsed to about 1/3 the size of the original. It appears to be lossy. Program bug?