PMXBOT Log file Viewer


#mongodb logs for Thursday the 11th of December, 2014

[00:52:23] <BigOrangeSU> hi all, is anyone familiar with Mongoid, I was hoping to understand how it supports updates to embedded objects within an array. Does it override the whole array element?
[00:54:15] <Boomtime> unless it has a way of figuring out the differences it pretty much has to - meanwhile, if you added a new item in the middle of the array then there is no other option anyway
[00:58:22] <GothAlice> BigOrangeSU Boomtime : MongoEngine recursively wraps arrays and sub-documents in proxies that record modifications. Certain ODMs do, in fact, implement proper delta .save() operations.
[00:59:01] <GothAlice> As for Mongoid, I do not believe it does from what I remember of the last time I went spelunking through the code.
[00:59:44] <Boomtime> I do not dispute the ability to do this, but I see you've arrived at the very same line I initially said, thanks
[01:02:01] <GothAlice> Boomtime: Elaboration FTW. Inserting values into the middle of an array _can_ be "delta"d, but depending on the parallelism of writes to one record, it can result in the inserted element not matching the expected position in the final record. (Sometimes this is an acceptable risk.)
[01:02:44] <zamnuts> question: if i have usePowerOf2Sizes enabled, and a chunk size of 1mb within GridFS, will a 1kb file consume 1mb or the next highest power of 2, i.e. 1kb? (disregard the n/_id/chunkSize meta overhead)
[01:07:28] <Boomtime> GridFS does not pad out the file chunk documents, it only caps their size
[01:07:59] <Boomtime> regardless of the chunksize you use, the last chunk of a file will be the size of the remainder, likely less than the chunksize
[01:08:26] <Boomtime> at that point, the other options kick in, if you use powerOfTwo then that is applied
[01:08:28] <zamnuts> Boomtime, so a 1kb file with a chunk size of 1mb will only consume 1kb on disk, and the extra 1023kb will be free and usable, right?
[01:09:03] <Boomtime> the numbers may not be precisely that, but loosely, yes
[01:09:24] <zamnuts> Boomtime, that is understandable, i'm simplifying to get the idea...
[01:09:36] <GothAlice> 1kb + metadata + BSON encoding + padding factor, specifically.
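A minimal mongo-shell sketch of what is being described, poking at a stored file's chunks (this assumes a small file has already been put into the default fs bucket, e.g. with mongofiles; the collection and field names are the stock GridFS ones):
    // the final (here: the only) chunk holds just the remaining bytes, so a 1 KB file stored
    // with a 1 MB chunkSize still produces a roughly 1 KB chunk document
    var f = db.fs.files.findOne();
    var c = db.fs.chunks.findOne({ files_id: f._id });
    Object.bsonsize(c)                    // actual BSON size of the chunk: ~file size plus _id/files_id/n overhead
    db.fs.chunks.stats().paddingFactor    // padding strategy applied to the chunks collection (MMAPv1-era field)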
[01:09:40] <zamnuts> Boomtime, won't that increase fragmentation then?
[01:09:42] <GothAlice> Boomtime: Can you confirm the default "padding factor" on newly inserted documents for me?
[01:10:51] <Boomtime> GothAlice: I assume you mean "new collections", since new documents use the padding factor of the collection - also, it only applies when not using powerOfTwo
[01:11:08] <GothAlice> Boomtime: I was asking that one for me. :3
[01:11:14] <Boomtime> i have no idea what the seed value is, it must be low though
[01:12:11] <Boomtime> zamnuts: very technically, yes, files of different sizes will cause a little fragmentation to occur
[01:12:22] <GothAlice> zamnuts: Fragmentation will mostly depend on the rate at which you delete things. And yes, unless you compact (defragment) it occasionally, a hole left behind after deleting a final chunk will only ever fit a final chunk of that size or less in the future. (As long as MongoDB's on-disk storage rules remain otherwise the same operating under powerOfTwo.) *digs more docs.
[01:14:43] <zamnuts> Boomtime, GothAlice, thanks so much - that answers that question; aside: will a higher chunk size in gridfs increase my read/write throughput?
[01:15:13] <GothAlice> A ha. I grok. The default powerOfTwo strategy would likely decrease document fragmentation a little, but waste a little bit of space in the process.
[01:15:48] <zamnuts> GothAlice, that is correct, that effect is prominent in https://jira.mongodb.org/browse/SERVER-13331
[01:17:32] <GothAlice> zamnuts: To a point. It's something worth experimenting with and benchmarking: larger chunk size will reduce the ratio of overhead to data on-the-wire, but the wire protocol already uses getMore() operations and other tricks to chop your data up in ways you can't overcome. There'll be a break-even point in performance. Not sure if MongoDB does jumbo frames on IPv4 or not, but IPv6 mandates support for it.
[01:18:57] <GothAlice> (Optimization without measurement is by definition premature. :)
[01:20:59] <zamnuts> GothAlice, that makes pretty good sense; i did run a test, increased chunk size from 255kb (default) to just less than 1mb, and saw no performance increase during writes, but it improved 2x for reads, granted this was a local loopback test
[01:21:50] <GothAlice> Loopback cheats: it's a shared memory circular buffer—doesn't even touch the network interface card.
[01:23:00] <GothAlice> Even if it's on the same network, always test over the full stack, and it's best to match the test DB's performance as closely as possible to production's, so you'll know how things will behave when deployed. :)
[01:23:07] <zamnuts> with that in mind, given what you mentioned about IPv4/jumbo frames, i might not even get that since at that point mongo won't be the bottleneck~ ok, i'll have to do more testing, thanks for the sanity check tho
[01:24:45] <GothAlice> IPv6 is nice if you can get it; it's often easier to get it internally. Less overhead, larger packets possible out-of-the-box.
[01:25:35] <zamnuts> GothAlice, i'll have to look into that, the mongodb cluster is in a private vrack, so it is a very feasible change
[01:25:53] <zamnuts> (the prod and preprod ones that is...)
[01:40:18] <adoming> Hey all, I'm a noob at mongo, I wanted to get some feedback on this Schema. My use case - I am creating a collection of documents, in a one-to-many relationship, I want to create a collection of unique URLs to view the document, then each time a link is viewed I want to store analytics info about the view of the link. My question is, do I break up the sub objects of Links and Analytics like RDBMS or keep it as is?
[01:40:18] <adoming> https://gist.github.com/adoming3/f5c7886d1e236fa9916d
[01:59:26] <ggoodman> is there a mechanism by which I can migrate entire collections from one remote machine to another while skipping the dump/restore process?
[02:02:26] <Boomtime> ggoodman: maybe copydb will be helpful http://docs.mongodb.org/manual/reference/command/copydb/
[02:03:41] <zamnuts> ggoodman, mongoexport -d db -c coll | ssh mongouser@host "mongoimport -d db -c coll"; # will pipe the export of localhost to remote mongod, that is if the problem is simply you not wanting to transfer files
[02:04:13] <ggoodman> zamnuts: helping me on every front!
[02:04:29] <Boomtime> note that mongoexport/mongoimport may not preserve all type information
[02:04:39] <zamnuts> ggoodman, make sure you test it first
[02:05:15] <zamnuts> Boomtime, perhaps mongodump/mongorestore then :)
[02:08:01] <Boomtime> yes
[02:09:37] <Boomtime> mongodump can write to stdout (only really good for piping) but this tactic doesn't achieve the goal of "skipping the dump/restore process", it merely makes that process avoid using the local disk (which may be the desired result)
[02:18:50] <ggoodman> can I do mongodump directly to the db dir on the target system to avoid the restore step?
[02:27:42] <cheeser> ggoodman: no
[02:27:55] <ggoodman> cheeser: thx
[02:28:00] <cheeser> ggoodman: would copyDb() work for you?
[02:28:03] <GothAlice> Well, sorta, using the ssh pipe method described earlier.
[02:28:12] <ggoodman> it failed partway through with some pipe error
[02:28:18] <ggoodman> cursor issue
[02:28:21] <cheeser> http://docs.mongodb.org/manual/reference/method/db.copyDatabase/
[02:30:09] <zamnuts> cheeser, that seems much easier, only problem is it doesn't do a snapshot, compared to mongodump --oplog ? or am i missing something?
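A minimal shell sketch of the copyDatabase route linked above, run from a shell connected to the destination (the hostname and database name are made up):
    // pulls every collection of the remote "appdata" database into a local db of the same name
    db.copyDatabase("appdata", "appdata", "source.example.com:27017")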
[03:15:57] <ggoodman> Can anyone hint to me what tool would make sense to migrate all records matching a query from one remote host to the local instance?
[03:17:22] <cheeser> mongodump --query ===> mongorestore
[03:19:13] <ggoodman> cheeser: will that clobber the existing database? What I'd like to do is only update those documents that match the query from the source database server
[03:20:14] <cheeser> you can't do updates that way, no.
[03:20:41] <cheeser> you can do mongoexport/mongoimport but you have to be careful of types in your docs.
[03:20:53] <cheeser> json has far fewer types than bson
[03:21:56] <ggoodman> hrm
[03:22:32] <zamnuts> ggoodman, if you can do this in real time, an application tied to the oplog might be better...
[03:22:54] <ggoodman> zamnuts: that is a bit above my understanding
[03:23:07] <ggoodman> Neither database is connected to production instances atm
[03:23:37] <ggoodman> perhaps I will need to do it by code?
[03:25:39] <ggoodman> annoying this lack of a simple mechanism to move a subset of documents from one instance to another (overwriting any at the target)
[03:53:58] <zamnuts> ggoodman, https://gist.github.com/zamnuts/fb5c569c17a9f0806b9b
[03:57:49] <tomhardy> hi guys.... i've been looking at indexeddb to write client side applications and need a backend database to store it too, which mongodb looks a good fit for. Conceptually I'm finding it quite difficult to understand coming from a standard sql background (i've worked on many enormous sql systems). If you have a users table and you constantly need to join, say, the user's name to other tables.. how would you replicate that functionality in mongodb
[03:57:57] <ggoodman> zamnuts: that is fantastic. Thank you
[03:58:10] <tomhardy> do you denormalize the data? or iterate through and spider off a request per record to load the name?
[03:58:32] <ggoodman> zamnuts: not familiar with the raw driver's findAndModify syntax... just the two parameters and callback?
[03:59:15] <zamnuts> ggoodman, 3 parameters + callback: findAndModify(query,sort,doc,callback); RTFM: http://mongodb.github.io/node-mongodb-native/api-generated/collection.html#findAndModify :)
[03:59:45] <zamnuts> there's an optional 4th in there for options, which i'm using for {upsert:true,new:true}
[04:00:17] <ggoodman> so should I stick a sort in there or is that optional... and yes I'm reading :D
[04:00:34] <zamnuts> tomhardy, yes, denormalize the data, there are no joins in mongodb, instead you nest documents
[04:01:00] <ggoodman> zamnuts: just felt like it was missing the updated document
[04:01:05] <zamnuts> ggoodman, sort is required, but if you don't really care, just sort on something indexed, e.g. _id
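A minimal node-mongodb-native sketch of the upsert-style findAndModify being discussed (the collection variable and the surrounding migration loop are illustrative, not taken from the gist):
    // signature per the 1.x driver docs linked above: findAndModify(query, sort, doc[, options], callback)
    targetColl.findAndModify(
        { _id: doc._id },               // query: match on _id
        [['_id', 'asc']],               // sort: required by this API, so sort on an indexed field
        doc,                            // full replacement document
        { upsert: true, new: true },    // insert if missing, return the updated document
        function (err, result) {
            // per-document error/result handling goes here
        }
    );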
[04:01:22] <tomhardy> zamnuts: ok.. so when you say update a username, you then need to find everywhere that name is stored and update it?
[04:02:42] <zamnuts> tomhardy, yes unfortunately, there are no foreign keys, so there is no other way... you could store the username as an ObjectId, and have a separate collection that maps ObjectId to username<string>, but you still cannot join and it will require 2 queries
[04:03:10] <Boomtime> tomhardy: how often do you change username compared to reading the current value?
[04:03:23] <tomhardy> i understand there are no joins.. but i mean you can implement a join by collecting up all the users values and then firing a separate query to collect the data
[04:04:25] <tomhardy> Boomtime: yeah rarely in that case, but really i'm trying to get conceptually how you would do something like that
[04:05:20] <Boomtime> most of the time you find you are better off optimizing the read condition - i.e embed the data - i ask the question to make you think about it
[04:05:44] <tomhardy> my real example is actually: i have Users(id, name): 300; WorkItems(id, user_id, title): 3000 per user; TimeItems(work_item_id, starttime, endtime): 30 per day per user
[04:06:17] <zamnuts> user_id becomes the Users.name typically
[04:06:23] <Boomtime> store username in workitem _as well_
[04:06:41] <Boomtime> you can still have a users table as the canonical source
[04:07:15] <Boomtime> but your read workload can be vastly reduced by storing the needed data inline with your common result documents
[04:07:45] <Boomtime> yes, updating a username now becomes a considerable task, which is why i ask: how often does a username change?
[04:09:44] <zamnuts> considerable yes, but not to be confused with scary, this is quite simple: workitems.update({username:'oldusername'},{$set:{username:'newusername'}},{multi:true});
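A minimal shell sketch of the denormalized shape Boomtime describes, borrowing tomhardy's field names (the exact schema and the userId variable are assumptions):
    // workitems carry the username inline, so the common read is a single query with no join
    db.workitems.insert({ user_id: userId, username: "alice", title: "write report" });
    db.workitems.find({ username: "alice" });                                           // frequent read path
    // the rare rename fans out: update the canonical user doc, then the denormalized copies
    db.users.update({ _id: userId }, { $set: { name: "alice2" } });
    db.workitems.update({ user_id: userId }, { $set: { username: "alice2" } }, { multi: true });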
[04:10:02] <georgeblazer> hello there, I'm having a weird issue after I migrated our DB to another piece of hardware
[04:10:03] <georgeblazer> Invalid BSONObj size: -286331154 (0xEEEEEEEE) first element: ts: Timestamp 1368605407000|1
[04:10:06] <georgeblazer> not sure how to debug it
[04:10:16] <georgeblazer> I have tried to reIndex indexes, but that's about it
[04:10:20] <georgeblazer> any interesting suggestions?
[04:10:48] <georgeblazer> I see the DB is there, but every time I try to do simple MapReduce function or anything more sophisticated, I get this
[04:11:14] <Boomtime> that's a corrupt file - how did you migrate it?
[04:11:39] <georgeblazer> Boomtime, I copied /var/db from the original host
[04:12:04] <Boomtime> mongodb was shutdown at the time?
[04:12:05] <georgeblazer> I ran out of space on the original host, so I could not do the mongodump
[04:12:10] <georgeblazer> it was not actually
[04:12:21] <joannac> not a valid copy then
[04:12:36] <georgeblazer> should I shut down mongo on the origin host, and copy /var/db?
[04:12:38] <georgeblazer> that should do it?
[04:12:49] <tomhardy> zamnuts: it's fast to update millions of records?
[04:13:02] <joannac> georgeblazer: yes, and make sure it's a clean shutdown
[04:13:17] <georgeblazer> @joannac: clean? db.shutdown()?
[04:13:27] <joannac> and make sure you shut down the destination before changing its files too
[04:13:42] <georgeblazer> but copying /var/db should be fine right?
[04:13:52] <joannac> georgeblazer: db.shutdownServer()
[04:13:59] <joannac> georgeblazer: yes, on a shutdown mongod instance
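A minimal sketch of the clean-shutdown-then-copy procedure joannac describes (paths and hostnames are illustrative; the copy itself is an OS-level step, shown only as a comment):
    // from a mongo shell on the source host:
    use admin
    db.shutdownServer()
    // with mongod stopped on both source and destination, copy the dbpath, e.g.:
    //   rsync -a /var/db/ newhost:/var/db/
    // then start mongod on the destination against the copied files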
[04:14:17] <georgeblazer> is there any way to "fix" corrupted DB?
[04:14:35] <joannac> you can repair it, but repair will get you back somewhere between 0% and 100% of your documents
[04:14:44] <joannac> just depends where the corruption is
[04:15:08] <georgeblazer> i have tried to do db.repairDatabase() but it didn't do anything
[04:16:31] <joannac> "didn't do anything" == "didn't make the error go away"?
[04:17:27] <georgeblazer> that's right
[04:18:04] <joannac> but the logs show the repair ran successfully?
[04:18:26] <georgeblazer> let me see
[04:18:36] <joannac> anyway, this is a moot point
[04:18:41] <zamnuts> tomhardy, depends on your mongod deployment, sharding, indices, replsets, write concern, etc.
[04:18:42] <joannac> just shutdown and copy again
[04:20:11] <georgeblazer> root@prod-mongodb-a3dbdc49:/var/log/mongodb# mongod --repair
[04:20:11] <georgeblazer> Thu Dec 11 04:18:33 [initandlisten] MongoDB starting : pid=19955 port=27017 dbpath=/data/db/ 64-bit host=prod-mongodb-a3dbdc49
[04:20:11] <georgeblazer> Thu Dec 11 04:18:33 [initandlisten] db version v2.0.8, pdfile version 4.5
[04:20:11] <georgeblazer> Thu Dec 11 04:18:33 [initandlisten] git version: a340a57af7cdda865da420704e1d1b2fac0cedc2
[04:20:11] <georgeblazer> Thu Dec 11 04:18:33 [initandlisten] build info: Linux ip-10-2-29-40 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
[04:20:11] <georgeblazer> Thu Dec 11 04:18:33 [initandlisten] options: { repair: true }
[04:20:11] <georgeblazer> Thu Dec 11 04:18:33 [initandlisten] journal dir=/data/db/journal
[04:20:12] <georgeblazer> Thu Dec 11 04:18:33 [initandlisten] recover : no journal files present, no recovery needed
[04:20:12] <georgeblazer> Thu Dec 11 04:18:33 [initandlisten] finished checking dbs
[04:20:13] <georgeblazer> Thu Dec 11 04:18:33 dbexit:
[04:20:13] <georgeblazer> Thu Dec 11 04:18:33 [initandlisten] shutdown: going to close listening sockets...
[04:20:14] <georgeblazer> Thu Dec 11 04:18:33 [initandlisten] shutdown: going to flush diaglog...
[04:20:21] <zamnuts> Ahhh, pastebin.
[04:20:44] <georgeblazer> sorry
[04:20:51] <georgeblazer> http://pastie.org/9773439
[04:20:56] <zamnuts> tomhardy, i'm hesitant to say "yes" ... do note that you will have a collection write-lock, and if it IS slow, you will incur yields
[04:21:28] <joannac> georgeblazer: you are on a terribly old version
[04:21:31] <tomhardy> zamnuts: yh
[04:22:20] <georgeblazer> joannac: sure, but I doubt it's the culprit
[04:22:24] <georgeblazer> unless you think it'd help
[04:22:37] <zamnuts> tomhardy, your historical/cumulative time on reads by embedding data will be much faster than the amount of time it will take to perform a "join" equivalent, so i wouldn't worry about the speed of the update
[04:22:48] <georgeblazer> i have a slave where everything works, but on the master it doesn't
[04:23:08] <zamnuts> hopefully that last statement isn't too negligent
[04:23:15] <georgeblazer> REALLY wanted to see if I can _fix_ the master w/o cutting over to the slave and/or losing the data post-upgrade
[04:23:32] <joannac> georgeblazer: the slave doesn't replicate writes?
[04:23:48] <tomhardy> zamnuts: yeah, i'm finding it very hard to work out an appropriate way of approaching the problem coming from an sql background
[04:24:04] <georgeblazer> joannac: yeah, it looks like replication is currently broken :(
[04:24:05] <joannac> georgeblazer: your version is old enough that I don't know if my knowledge applies to your version
[04:24:21] <georgeblazer> joannac, so you think upgrade is in order
[04:26:13] <zamnuts> question: how does everyone deploy their mongos instances? on the same node as the driver? on a dedicated vm/hardware? what about a cluster config: map to the mongos' in the driver, or put them behind 1 or several LBs?
[04:28:15] <georgeblazer> joannac:just upgraded the DB, but still in the same boat
[04:28:59] <zamnuts> georgeblazer, why are you hesitant with switching over to a secondary, then rebuilding the old primary?
[04:29:15] <georgeblazer> zamnuts: because there were some writes to the primary
[04:29:23] <georgeblazer> i'd love not to lose it if possible
[04:29:58] <zamnuts> georgeblazer, can you replay the oplog on the secondary from where the broken primary left off?
[04:30:52] <georgeblazer> zamnuts: sorry, new to mongo, where is oplog
[04:30:55] <zamnuts> that way you can recover the delta
[04:32:10] <zamnuts> georgeblazer, in the local.oplog.rs collection
[04:33:37] <georgeblazer> > show collections
[04:33:37] <georgeblazer> me
[04:33:37] <georgeblazer> oplog.$main
[04:33:37] <georgeblazer> slaves
[04:33:37] <georgeblazer> sources
[04:33:37] <georgeblazer> system.indexes
[04:33:37] <georgeblazer> > db.slaves.find()
[04:33:38] <georgeblazer> { "_id" : ObjectId("50d0e7f0432390dc134d4f03"), "host" : "127.0.0.1", "ns" : "local.oplog.$main", "syncedTo" : Timestamp(1418072620, 1) }
[04:33:42] <georgeblazer> sorry, not a pastebin
[04:33:58] <georgeblazer> and that was `use local`
[04:34:53] <georgeblazer> zamnuts:?
[04:35:15] <zamnuts> use local; db.oplog.rs.findOne(); should give you something, gotta run it from the mongo shell
[04:36:03] <zamnuts> sorry, might be in oplog.$main, you're v2.0 yea?
[04:36:14] <georgeblazer> > use local
[04:36:14] <georgeblazer> switched to db local
[04:36:14] <georgeblazer> > db.oplog.rs.findOne()
[04:36:14] <georgeblazer> null
[04:36:18] <georgeblazer> zamnuts: yep
[04:36:30] <zamnuts> db.oplog.$main.findOne() ?
[04:36:42] <georgeblazer> db.oplog.$main.findOne()
[04:36:42] <georgeblazer> 2014-12-11T04:35:18.305+0000 error: {
[04:36:42] <georgeblazer> "$err" : "Invalid BSONObj size: -286331154 (0xEEEEEEEE) first element: ts: Timestamp 1368605407000|1",
[04:36:42] <georgeblazer> "code" : 10334
[04:36:42] <georgeblazer> } at src/mongo/shell/query.js:131
[04:36:50] <zamnuts> yayyy
[04:36:57] <georgeblazer> ok, what?
[04:37:01] <georgeblazer> what's yayy :)
[04:37:01] <zamnuts> well forget that idea~
[04:37:11] <georgeblazer> ah, my oplog is busted?
[04:37:31] <zamnuts> it appears to be
[04:37:35] <georgeblazer> well, it's not the end of the world, but i'd like to understand WTF
[04:37:39] <georgeblazer> and what's caused it
[04:37:58] <zamnuts> georgeblazer, it sounds like the raw db copy w/o shutting down mongod
[04:38:04] <zamnuts> like what was discussed before, no?
[04:38:14] <georgeblazer> zamnuts:right...
[04:38:17] <georgeblazer> what about...
[04:38:29] <georgeblazer> I see the data in the database that was inserted after the migration
[04:38:34] <georgeblazer> is there any way to extract it?
[04:38:47] <georgeblazer> also...
[04:39:13] <georgeblazer> do you think it makes sense to promote a slave that is good, and start replicating from scratch?
[04:39:15] <georgeblazer> or...
[04:39:25] <georgeblazer> shut down the old master, and copy /var/db?
[04:40:37] <zamnuts> if you can read it, then you can move it... perhaps not by normal migration channels however. might need to do a db.col.find({...});
[04:41:27] <georgeblazer> right, thx zamnuts
[04:41:45] <zamnuts> depends on 2 things: if the size of the db is "small" then replicating from scratch will be fine; if you cannot reliably copy a good db from /var/db, then you have no choice but to replicate from scratch
[04:42:06] <georgeblazer> another thing is, is there any way to find the most recent entries if I don't have created_at field?
[04:42:09] <georgeblazer> the DB is about 8GB
[04:43:07] <zamnuts> the recommended approach is: mongodump > mongorestore > replicate from point-in-time to play catchup; the point of mongodump/mongorestore is so you don't have to replicate EVERYTHING since you'll be starting from a large subset
[04:43:35] <georgeblazer> so I'd restore the old master on both new master and slave, right?
[04:43:44] <georgeblazer> and then how do I start replicating at the right position
[04:44:01] <georgeblazer> also, can't i just copy /var/db instead of doing mongodump? mongorestore?
[04:44:24] <zamnuts> you don't have to specify from the right position, mongod will figure that out... if there were any interim writes before the mongodump and the repl setup, it'll get those (granted your oplog cap is large enough depending on your write volume)
[04:44:39] <georgeblazer> really? well, ok, thx zamnuts
[04:44:54] <georgeblazer> need to run it by the guy who owns the data, otherwise, i think i'm clear
[04:45:10] <georgeblazer> and, thx again
[04:45:20] <zamnuts> yes, u can copy from /var/db, but you gotta stop the mongod process that is using those files.
[04:45:34] <zamnuts> if you can't stop the process, your only course of action is a mongodump
[04:45:50] <zamnuts> if you're low on disk, you could pipe mongodump via stdout over ssh to a new node
[04:46:14] <zamnuts> or perform the mongodump from the remote host by connecting to your target over the wire
[04:48:11] <zamnuts> georgeblazer, "so I'd restore the old master on both new master and slave, right?" that is correct (sounds like a 3-piece replset), but i thought your "old master" was corrupt?
[04:48:55] <zamnuts> georgeblazer, are you using a replset or a legacy master/slave configuration?
[05:02:00] <georgeblazer> my old old master is fine, but it doesn't have any space left on disk
[05:02:33] <georgeblazer> it's on EBS, so I can shut down mongo, copy DB to another instance, and copy the DB
[05:02:40] <georgeblazer> legacy master/slave
[05:04:11] <zamnuts> georgeblazer, going to be honest w/ you, i've never worked with master/slave config, only the current replset
[05:05:00] <zamnuts> georgeblazer, you got enough free space to work on? i.e. more than a few hundred mb?
[05:05:01] <georgeblazer> zamnuts:well, one way or another, it sounds like I can copy /var/db from the shutdown DB, to both master and slave, and hopefully the slave will start slaving
[05:05:21] <georgeblazer> zamnuts:not on the main EBS volume, I'd attach another volume
[05:05:30] <zamnuts> georgeblazer, derp, ok
[05:06:06] <georgeblazer> so it sounds like the main source of corruption was due to the DB not being shut down
[05:07:50] <zamnuts> georgeblazer, g/l
[05:54:56] <georgeblazer> how to query for newest records in the collection if created_at is not there
[05:56:46] <joannac> default _id includes a timestamp, so it can be used as an approximation
[05:57:09] <georgeblazer> sort by id?
[05:59:43] <zamnuts> sort([['_id',-1]]).limit(1);
[06:00:37] <zamnuts> the first 4 bytes are seconds since the unix epoch, last 3 bytes are a counter, so generally... _id can be used to sort by "creation" date
[06:01:01] <zamnuts> see http://docs.mongodb.org/manual/reference/object-id/
[06:01:05] <zamnuts> georgeblazer, ^
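A minimal shell sketch of the _id-as-timestamp trick described above (the collection name is illustrative):
    // an ObjectId starts with a 4-byte creation timestamp, so sorting on _id approximates insertion order
    db.events.find().sort({ _id: -1 }).limit(10)          // newest documents first
    db.events.findOne()._id.getTimestamp()                // shell helper that extracts the embedded date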
[06:21:10] <georgeblazer> is there any way to find an object whose BSON representation is a negative number?
[06:23:14] <Boomtime> BSON is a data encoding system, what do you mean by "is a negative number" in this context?
[06:23:44] <Boomtime> documents, arrays, integers, dates, etc, all have BSON representations
[09:32:27] <h3m5k2> I want to run a repairDatabase and understand that I need enough free disk to "hold both the old and new database files + 2GB". My question is; what is "database files" referring to in stats()? is it storageSize or totalSize or dataSize? This is crucial info as my storageSize is 30GB+ while dataSize is only ~1.5GB and I have about 10GB of free space.
[09:33:42] <kali> h3m5k2: storageSize is the current space occupied. so it's "old database files"
[09:34:17] <kali> h3m5k2: the new database will be at least "totalSize" (total is actually data+indexes)
[09:34:57] <h3m5k2> kali: ok, so the free disk space minimum is totalSize + 2gb
[09:35:39] <kali> h3m5k2: yeah. but to be on the safe side, i would count 2*totalSize + 2GB
[09:36:13] <h3m5k2> kali: ok. great thanks a lot for clarifying that!
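A quick shell sketch of the sizing arithmetic above (the inputs come from db.stats(); the 2 GB constant and the 2x safety margin are from the discussion, not something the server reports):
    var s = db.stats();
    var totalSize = s.dataSize + s.indexSize;                   // "total" = data plus indexes
    var minFree   = totalSize + 2 * 1024 * 1024 * 1024;         // documented minimum for repairDatabase
    var safeFree  = 2 * totalSize + 2 * 1024 * 1024 * 1024;     // kali's more conservative recommendation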
[10:52:26] <alexi5> hello ladies and gentlemen
[12:24:49] <nofxx_> Here in MMS, which statistic matters the most on writes? Lock %? How can I monitor if I'm losing or slow on some writes?
[12:29:02] <alexi5> I am new to mongodb and I am wondering why mongodb is used instead of an ORM on an RDBMS ?
[12:30:00] <vincent-> alexi5: I think that question is widely answered over the Internet. You just need to Google it.
[12:31:04] <nofxx_> alexi5, ORM is to RDBMS as ODM is to mongo
[12:31:18] <nofxx_> you're probably going to choose an ODM to work faster...
[12:32:28] <alexi5> i found that mongodb is preferred due to the impedance mismatch in converting objects to relational structures, while objects are similar to documents, so there's less mismatch between the two
[12:33:05] <Derick> IMO, an ODM is now just a simple small layer checking against data types - not an ORM that needs to construct queries, pull in things from different tables etc...
[12:33:13] <nofxx_> alexi5, I heard the same of RDBMs... its really a matter of YOUR data
[12:33:14] <Derick> alexi5: yup
[12:33:44] <nofxx_> btw, have you guys played with tokumx?
[12:34:05] <Derick> not me
[12:35:04] <alexi5> I am thinking of using mongodb for a project I have to catalog network nodes on our network at work. catalog as in store info about configuration, location, its components and also pictures that engineers take of the equipment
[12:41:31] <alexi5> other than blogs and news sites, are there any other use cases for mongodb ?
[12:44:11] <nofxx_> alexi5, your catalog
[12:46:23] <alexi5> hmm ok. nice
[12:47:20] <alexi5> one other thing, is mongodb well suited for storing pictures and their metadata ?
[12:49:20] <nofxx_> alexi5, no one does that man... even in RDBMS, it's an old very bad idea
[12:49:35] <Derick> plenty of people do that, even using video streaming...
[12:49:40] <nofxx_> use S3 if you're online or just the filesystem if it's an intranet
[12:49:56] <nofxx_> Derick, really? so why is gridfs gonne from mongo?
[12:50:10] <Derick> nofxx_: Sorry, I did not understand what you said there.
[12:51:00] <alexi5> ok so store path with metadata, not pictures in mongodb collection
[12:51:27] <nofxx_> yup
[12:53:43] <nofxx_> Derick, I thought gridfs was gonne... my bad
[12:54:11] <Derick> "was gonne" ?
[12:55:00] <Derick> you mean, has been removed?
[12:55:25] <nofxx_> yup, didn't read anything about it for a while I guess
[12:55:42] <Derick> it's certainly still there
[12:55:47] <Derick> it's a driver thing though
[13:51:54] <remonvv> The general sentiment that you shouldn't store large amounts of static binary data in a database is generally valid though.
[13:52:15] <remonvv> Oh, old discussion. Sorry.
[14:52:55] <alexi5> hello guys and gals
[14:56:54] <alexi5> if i have an application that inserts a value in an array in a document and also increments a field in that document, how does mongodb handle this without causing race conditions when multiple connections attempt the same thing ?
[15:28:26] <GothAlice> alexi5: It doesn't. :) It does, however, give you a method by which you can implement your own locking semantics, if desired. Using update-if-not-modified and/or upserts.
[15:29:34] <GothAlice> alexi5: Basically, individual update operations are atomic (i.e. $set, $push, etc.) and each of these is applied, one at a time. Using update-if-not-modified you query for an expected value and only perform some update if that expected value is found. If another process snuck in and changed it since the last time the first process loaded it, the first process will fail its update in a predictable way. (nUpdated=0)
[15:30:45] <GothAlice> Upserts are "insert this if not found, update it otherwise" updates. This lets you avoid needing to be so worried about inserting vs. updating records that may or may not exist.
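A minimal shell sketch of the update-if-not-modified pattern GothAlice describes (collection and field names are illustrative, docId/expectedVersion/newItem are assumed to have been loaded earlier, and the WriteResult field shown needs a 2.6+ shell):
    var res = db.scores.update(
        { _id: docId, version: expectedVersion },                       // match only if nobody changed it since we read it
        { $push: { items: newItem }, $inc: { total: 1, version: 1 } }   // single-document update: applied atomically
    );
    if (res.nModified === 0) {
        // another writer got there first: reload the document and retry
    }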
[15:45:25] <agend> hi
[15:46:43] <agend> what i want to achieve is to insert multiple docs into mongo - but if a document with this _id already exists I want it to be replaced - how can I do it?
[15:50:03] <cheeser> use upserts
[15:52:54] <agend> cheeser: and i want it to be as fast as possible - and as far as i know there are no bulk upserts - right?
[15:53:10] <cheeser> that's probably correct
[15:53:33] <agend> cheeser: are there any tricks to use bulk insert for replacing?
[15:54:02] <agend> are there bulk deletes?
[15:56:49] <cheeser> sure
[16:00:33] <agend> cheeser: sorry - sure what?
[16:12:41] <cheeser> "are there bulk deletes?" "sure"
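A minimal shell sketch of the upsert approach cheeser suggests (the docs array and collection name are illustrative):
    // an upserted update with a full replacement document inserts the doc if the _id is new,
    // or replaces the existing doc if it already exists
    docs.forEach(function (d) {
        db.items.update({ _id: d._id }, d, { upsert: true });
    });
    // on 2.6+ the shell's unordered Bulk API can batch the same upserts into fewer round trips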
[17:08:14] <alexi5> thanks guys
[17:16:09] <remonvv> Also, upserts aren't atomic iirc
[17:17:04] <remonvv> aren't -> aren't always
[17:24:01] <remonvv> https://jira.mongodb.org/browse/SERVER-12694
[17:25:55] <Rubemlrm> Good Afternoon folks
[17:28:55] <Rubemlrm> https://dl.dropboxusercontent.com/u/2079219/Untitled.jpg
[17:28:59] <Rubemlrm> i have one question
[17:29:04] <Rubemlrm> i have the following collection
[17:29:20] <Rubemlrm> how can i retrieve only the object that is inside the black box
[17:29:25] <Rubemlrm> and update that object ?
[17:30:28] <Rubemlrm> i want to update one element of questsItem
[17:32:07] <lqez> Rubemlrm: Sure, you can return a partial document and also update it.
[17:32:08] <lqez> http://docs.mongodb.org/manual/tutorial/project-fields-from-query-results/
[17:32:16] <lqez> http://docs.mongodb.org/manual/tutorial/modify-documents/#update-specific-fields-in-a-document
[17:35:00] <Rubemlrm> lqez i've tried this db.quests.find( { _id: '547dedfa27987158048b4567' }, { questsItem: { $slice: 1 } })
[17:35:17] <Rubemlrm> the query runs, but doesn't return anything
[17:35:36] <lqez> you have to wrap the id with ObjectId().
[17:36:07] <lqez> like _id: ObjectId('...')
[17:41:48] <Rubemlrm> lqez and can i search for the id of the inner object too ?
[17:41:53] <Rubemlrm> instead using $slice : 1 ?
[17:42:04] <Rubemlrm> but using something like questsItem._id : 1
[17:42:05] <Rubemlrm> ?
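A minimal shell sketch of updating one element of questsItem in place with the positional operator (the inner _id value and the title field are assumptions about the document in the screenshot):
    db.quests.update(
        { _id: ObjectId('547dedfa27987158048b4567'), 'questsItem._id': ObjectId('<inner id>') },
        { $set: { 'questsItem.$.title': 'new title' } }   // $ refers to the array element matched by the query
    )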
[18:52:16] <bttf> new to mongo ... is it possible for a node app to have something like a local instance of mongodb that is runnable, rather than installing it systemwide on the machine?
[18:52:27] <bttf> for development/testing purposes
[18:55:27] <syllogismos> "note" : "thisIsAnEstimate", "pagesInMemory" : 79799, "computationTimeMicros" : 18343, "overSeconds" : 770 these are the workingSet stats of our mongodb
[18:55:54] <syllogismos> what do the stats mean?
[19:00:04] <skot> http://docs.mongodb.org/manual/reference/command/serverStatus/#server-status-workingset
[19:00:27] <skot> http://docs.mongodb.org/manual/faq/diagnostics/#what-is-working-set-and-how-can-i-estimate-its-size
[20:42:20] <Thinh> Hi guys, if I have a collection with the structure: {id: ObjectID, courses: [CourseObjectID1, CourseObjectID2, ... ]}
[20:42:39] <Thinh> How can I select all objects from that collection that has say, CourseObjectID2 in courses field?
[20:45:38] <Thinh> nevermind, I need $elemMatch :)
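For what it's worth, with an array of plain values like the courses list above, a direct equality match already compares against each element; $elemMatch is only needed to apply several conditions to the same array element. A minimal shell sketch (collection name is illustrative):
    // matches every document whose courses array contains the given id
    db.enrollments.find({ courses: CourseObjectID2 })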
[20:54:14] <blizzow> I dropped a lot of my collections (50 or so) and re-created them with sharding enabled. I enabled sharding while the collections were empty and then started to insert data. Almost 24 hours later, the chunks have still not distributed amongst my shards. Does anyone know what might cause the sharding to be so slow? On a side note, I have one collection that I cannot drop, because the "metadata is being changed, but I don't see the c
[20:54:24] <blizzow> not chunk, collection.
[20:57:02] <Rubemlrm> hi there
[20:57:07] <Rubemlrm> i have a little question
[20:57:15] <Rubemlrm> about findAndModify
[20:57:31] <Rubemlrm> http://pastebin.com/Xd3aGuSK
[20:57:34] <Rubemlrm> i have this query
[20:57:41] <Rubemlrm> but it says that at line 2
[20:57:49] <Rubemlrm> Unexpected token {
[20:57:57] <Rubemlrm> but i think that it's ok
[20:58:08] <Rubemlrm> can't i do it the way i wrote it?
[21:32:53] <olivierrr> question: for a 'kik' type chat app, is it best to have a document per message or per session
[21:33:27] <olivierrr> session = chat between 2 or more members
[21:41:15] <jonasliljestrand> findOneAndModify is atomic right? :S
[21:48:41] <jonasliljestrand> If i perform findAndModify { multi: false }, can this document be read by any other connection at the same time? And when the write lock is released will the other connections get that document if it still matches their criteria?
[22:00:13] <jonasliljestrand> also, is it possible to lock a document in all databases for a time
[22:00:14] <jonasliljestrand> ?
[22:11:21] <dlewis> is there a general purpose Mongo -> SQL importer? I plan on generating a database from the SQL import, so it doesn't have to be too pretty
[22:12:34] <dlewis> this is for analytics purposes
[22:12:38] <dlewis> and data warehousing
[22:39:20] <SpNg> is there any way to store an object key with a dot in it?
[22:41:05] <modulus^> backwhack or escape it
[22:41:30] <modulus^> whatever language you're using to store data should have a way to backwhack/escape a dot
[22:42:26] <cheeser> i don't think that's allowed by the server...
[22:42:49] <cheeser> it wouldn't be able to distinguish between a key and a nested document key
[22:43:41] <cheeser> nope. no dots.
[22:43:51] <modulus^> why can't it be escaped?
[22:44:02] <SpNg> cheeser: so do I need to sniff for all dots and convert them to some other character?
[22:44:49] <modulus^> SpNg: no you need to snort coke
[22:45:24] <cheeser> SpNg: more or less
[22:46:24] <modulus^> ya'll need jesus
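A minimal shell sketch of the key-transforming workaround implied above (the replacement character is just one convention, not something the server defines):
    // the server rejects '.' inside key names, so map dots to a placeholder before writing
    var doc = { settings: {} };
    doc.settings["example" + "\uff0e" + "com"] = true;   // full-width dot stands in for '.'
    db.prefs.insert(doc);
    // apply the reverse mapping when reading the document back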
[22:57:00] <blizzow> From mongoshell I want to run db.my_collection.find().sort( { _id:-1 } ).limit(1) and only return the objectID instead of all the records in the document. How do I limit the output to not show any of the records in the document?
[23:03:21] <modulus^> blizzow: you cannot parse the JSON output?
[23:04:46] <cheeser> blizzow: http://docs.mongodb.org/manual/reference/method/db.collection.find/#projections
[23:05:43] <modulus^> { excludefield: 0 }
[23:05:46] <modulus^> yeah!
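A minimal shell sketch of the projection cheeser links, applied to blizzow's query:
    // the projection keeps only _id, so the single returned document is just the newest ObjectId
    db.my_collection.find({}, { _id: 1 }).sort({ _id: -1 }).limit(1)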
[23:11:15] <dpg2> how do I query for something like { "object": { "id" : "19198dc5-5040-4034-9edb-d0e9cb3207e2" } } ? is it just { "object.id": "19198dc5-5040-4034-9edb-d0e9cb3207e2" }
[23:11:40] <dpg2> mongolab's syntax thing isn't helping me.
[23:11:52] <appledash> Xe, you're a mong
[23:11:55] <appledash> odb user
[23:28:11] <joannac> dpg2: yes
[23:28:45] <dpg2> ok, then mongolab isn't working properly. Thank you @joannac