MongoDump and MongoRestore vs MongoExport and MongoImport
MongoDB is a source-available cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas.
In this blog, I am going to compare the usage of ‘mongodump and mongorestore’ with ‘mongoexport and mongoimport’. I assume that readers come from a MongoDB background, so the terms used here are specific to MongoDB.
When we work in an organization or on a huge codebase with a hefty amount of data in the database, we end up using these commands quite often. They call for a lot of thinking and sanity checks beforehand, because they manipulate data, and manipulating large volumes of data is a task to be handled with care. A mistake could lead to database downtime or duplicated data, which in turn could be costly, whether in engineering effort (i.e. re-stabilizing the database) or in lost user trust.
Let’s talk about why we need these utilities. From the names, we can conclude that mongodump and mongoexport export data from the database, and mongorestore and mongoimport import data into it. The important question is why there are multiple utilities for similar tasks. That is what we are going to understand.
mongodump
mongodump is a command line utility for creating a binary export of the contents of a database. mongodump can export data from either mongod or mongos instances; i.e. it can export data from standalone, replica set, and sharded cluster deployments. A sample mongodump command is:
$ mongodump --uri=<MONGO_URL> --db=<DB_NAME> --collection=<COLL_NAME>
This command takes a dump using mongodump. By default, a folder named dump is created in the current directory and the database dump is written inside it. To let the utility connect to the MongoDB instance, we either pass the MONGO_URL in the --uri parameter or use --username=myuser and --password=somepass. We can specify the database and collection to dump using the --db and --collection parameters, and we can use the --out parameter to change the dump directory. For every collection dumped, mongodump generates two files: a .bson file and a .metadata.json file. For more depth on mongodump, refer to the official documentation. The database backup can also be written to a single archive file using the --archive option.
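As a minimal sketch of the --out and --archive options described above, the two functions below dump the same database once into a directory and once into a single archive file. The connection string, the database name mydb, the output paths, and the RUN preview switch are assumptions made for illustration; --gzip is an additional mongodump option that compresses the output. By default the sketch only prints the commands it would run.

```shell
# Hypothetical connection string; replace with your own.
MONGO_URL="mongodb://localhost:27017"
# Preview switch (an assumption of this sketch): commands are only printed
# unless the script is started with RUN set to an empty value.
RUN=${RUN-echo}

# Dump one database into a custom directory instead of the default ./dump
dump_to_dir() {
  # $1 = database name, $2 = output directory
  $RUN mongodump --uri="$MONGO_URL" --db="$1" --out="$2"
}

# Dump the same database into a single gzip-compressed archive file
dump_to_archive() {
  # $1 = database name, $2 = archive file
  $RUN mongodump --uri="$MONGO_URL" --db="$1" --archive="$2" --gzip
}

dump_to_dir mydb mydb-dump
dump_to_archive mydb mydb.archive.gz
```

Saved as a script, it can be run for real with something like RUN= sh dump-examples.sh (the file name is hypothetical); the two dumps are kept as separate invocations on purpose.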
The reason we use mongodump is that it creates binary copies and is more efficient: it is generally faster than mongoexport and produces a more compact dump. Another reason is that, since mongodump writes .bson files, it preserves the rich BSON data types. It also records the index definitions and collection options in the .metadata.json file, so when we restore the data, the indexes are recreated automatically. Hence, it is generally used (and recommended by MongoDB) for scheduled data backups via cron or a similar scheduler.
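A scheduled backup of the kind mentioned above could look roughly like the following sketch. The script path, backup directory, connection string, and the RUN preview switch are all assumptions; the script writes one date-stamped, gzip-compressed archive per run and, by default, only prints the mongodump command.

```shell
#!/bin/sh
# Hypothetical nightly backup script, e.g. saved as /usr/local/bin/mongo-backup.sh
MONGO_URL="${MONGO_URL:-mongodb://localhost:27017}"   # assumed connection string
BACKUP_DIR="${BACKUP_DIR:-/var/backups/mongo}"        # assumed directory, should already exist
RUN=${RUN-echo}   # prints the command by default; start with RUN= (empty) to execute

# One date-stamped, gzip-compressed archive per day
ARCHIVE="$BACKUP_DIR/backup-$(date +%Y-%m-%d).archive.gz"
$RUN mongodump --uri="$MONGO_URL" --archive="$ARCHIVE" --gzip

# Example crontab entry that would run the script every day at 02:00:
# 0 2 * * * RUN= /usr/local/bin/mongo-backup.sh
```

Rotating or deleting old archives is left out here and would need to be handled separately.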
Note: We cannot use the --archive and --out flags together in the same mongodump command.
mongorestore
mongorestore is the complement of mongodump and is used to restore the data dumped by mongodump (the JSON or CSV files produced by mongoexport are loaded with mongoimport instead). mongorestore is an insert-only operation, which means that it will not overwrite a document that already exists in the database with the same _id.
mongorestore also supports partial restores: we can specify a collection, or a list of collections, that we would like to restore to our database. If we only need to restore one collection, we can use the --collection flag, in which case we also need to specify the --db flag with the database we are restoring to, as well as the path to the collection's BSON file. It is recommended to use the --nsInclude flag instead of the --collection flag.
If we want to check whether a mongorestore will work before importing the database, we can use the --dryRun flag. With --dryRun, mongorestore returns its summary information without importing any data.
$ mongorestore --nsInclude="staging.*" dump/
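To make the partial-restore and dry-run options more concrete, here is a small sketch. The connection string, the staging database, the staging.logs collection, the file names, and the RUN preview switch are assumptions for illustration; --nsExclude, --verbose, --archive and --gzip are existing mongorestore options.

```shell
# Hypothetical connection string; replace with your own.
MONGO_URL="mongodb://localhost:27017"
# Preview switch (an assumption of this sketch): commands are only printed
# unless the script is started with RUN set to an empty value.
RUN=${RUN-echo}

# Check a restore of every collection in "staging" except its logs,
# without importing anything
preview_restore() {
  # $1 = dump directory created by mongodump
  $RUN mongorestore --uri="$MONGO_URL" --nsInclude="staging.*" \
    --nsExclude="staging.logs" --dryRun --verbose "$1"
}

# Restore the "staging" collections from a gzip-compressed archive
restore_archive() {
  # $1 = archive file created with mongodump --archive --gzip
  $RUN mongorestore --uri="$MONGO_URL" --archive="$1" --gzip --nsInclude="staging.*"
}

preview_restore dump/
restore_archive staging.archive.gz
```

Running the --dryRun variant first and the real restore afterwards keeps the two steps explicit.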
mongoexport
mongoexport serves a similar purpose to mongodump, but it exports a single collection in a human-readable format (JSON by default), which makes it less space-efficient and slower. By default, mongoexport writes to stdout, but we can write the data to a file using the --out parameter. Unlike the output of mongodump, a mongoexport file is not compact binary data and does not come with a .metadata.json file, so it cannot preserve the indexes, and by default it does not fully preserve the rich BSON data types either. Newer versions of mongoexport, however, can preserve the rich BSON types by using the --jsonFormat=canonical parameter.
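The difference between the two JSON formats is easiest to see by exporting the same collection twice. In this sketch the connection string, the mydb database, the users collection, the output file names, and the RUN preview switch are assumptions; relaxed is the documented default value of --jsonFormat.

```shell
# Hypothetical connection string; replace with your own.
MONGO_URL="mongodb://localhost:27017"
# Preview switch (an assumption of this sketch): commands are only printed
# unless the script is started with RUN set to an empty value.
RUN=${RUN-echo}

# Export one collection to a JSON file in the given Extended JSON format
export_json() {
  # $1 = collection name, $2 = canonical | relaxed
  $RUN mongoexport --uri="$MONGO_URL" --db=mydb --collection="$1" \
    --jsonFormat="$2" --out="$1.$2.json"
}

export_json users relaxed
export_json users canonical
```

In relaxed mode a 32-bit integer field is written as a plain number such as 1, while canonical mode writes it as {"$numberInt": "1"}, which is what lets the type survive a round trip.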
A benefit of mongoexport is that we can export specific fields only, using the --fields=field1,field2... parameter with the field names separated by commas. We can also use --skip, --limit and --sort to restrict and order the results, and the --query parameter to filter which documents are exported. mongoexport offers many other options as well. A dump taken with mongoexport is also easy to import into database UI applications like MongoDB Compass.
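The filtering options can be combined in a single command. The sketch below assumes a hypothetical users collection in mydb with status, createdAt, name and email fields; the RUN preview switch is also an assumption. It exports the 100 most recently created active users as CSV, which is requested with --type=csv and needs the field list.

```shell
# Hypothetical connection string; replace with your own.
MONGO_URL="mongodb://localhost:27017"
# Preview switch (an assumption of this sketch): commands are only printed
# unless the script is started with RUN set to an empty value.
RUN=${RUN-echo}

# Export a filtered, sorted and limited slice of a collection as CSV
export_active_users() {
  $RUN mongoexport --uri="$MONGO_URL" --db=mydb --collection=users \
    --query='{"status": "active"}' --sort='{"createdAt": -1}' --limit=100 \
    --fields=name,email --type=csv --out=active_users.csv
}

export_active_users
```

The --query and --sort values are JSON documents, so they are wrapped in single quotes to keep the shell from interpreting the double quotes and braces.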
mongoimport
mongoimport is the complement of mongoexport and is used to import the Extended JSON, CSV, or TSV export created by mongoexport or, potentially, by another third-party export tool. The big difference between mongorestore and the mongoimport utility is the --mode flag, which has the options insert, upsert and merge. As the names suggest, in insert mode only documents with a new _id are inserted, and an error is logged for each duplicate. In upsert mode, if a document with the same _id already exists, the new document replaces it. In merge mode, an existing document that matches a document in the import file is merged with the new document.
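The three modes can be tried against the same file. In this sketch the connection string, the mydb database, the users collection, the users.json file, and the RUN preview switch are assumptions made for illustration.

```shell
# Hypothetical connection string; replace with your own.
MONGO_URL="mongodb://localhost:27017"
# Preview switch (an assumption of this sketch): commands are only printed
# unless the script is started with RUN set to an empty value.
RUN=${RUN-echo}

# Import a JSON export into mydb.users using the given mode
import_users() {
  # $1 = insert | upsert | merge
  $RUN mongoimport --uri="$MONGO_URL" --db=mydb --collection=users \
    --mode="$1" --file=users.json
}

import_users insert   # documents whose _id already exists are reported as errors
import_users upsert   # matching documents are replaced by the imported ones
import_users merge    # matching documents are merged with the imported ones
```

Documents are matched on _id by default; mongoimport also has an --upsertFields option for matching on other fields.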
Conclusion: The mongodump/mongorestore utilities are well suited to a MongoDB backup and restoration strategy. They allow us to dump to and restore from a single archive file that can easily be compressed and stored or transferred. mongorestore also lets us include or exclude certain collections when restoring into the new database, using the --nsInclude and --nsExclude flags. The mongoexport/mongoimport utilities are great for working with collections within a MongoDB database and inserting/updating documents within those collections. We can use the --mode flag in mongoimport to define whether we want to insert, replace or merge documents from our imported file.
Thanks for reading, and you can reach out to me on LinkedIn and GitHub.