[05:15:33] <hyperboreean> hey, anyone around to help me out with an issue? I have task1 pushing data into a mongodb shard (details about orders) and after that, the same task1 pushes data to rabbitmq (only the order ids of the orders that were already pushed to mongo); task2 picks up the order ids and searches mongo for details about them, but randomly some order ids are missing from mongo. task1 checks lastError for every entry it has pushed to mongodb, but ...
[05:15:39] <hyperboreean> ... nothing wrong seems to happen. Any ideas why I get this behaviour?
[05:20:29] <sirious> has anyone found an efficient way in pymongo to check the document size before updating and calculate if the update will exceed the document size limit?
[05:45:39] <jgiorgi> sirious: as a general rule of thumb, if you're worried about the document size, your document is too big; if you really want to, you can use getsizeof on the individual elements in your document
[05:46:16] <jgiorgi> but like i said, you probably have something really big that should be stored with gridfs or you have way too much in one document
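For reference, getsizeof measures Python object overhead rather than BSON size; encoding the candidate document and checking its length against the 16MB BSON limit is more direct. A minimal sketch, assuming pymongo's bundled bson package (function and variable names are hypothetical):

    import bson

    MAX_BSON_SIZE = 16 * 1024 * 1024  # MongoDB's per-document BSON limit

    def fits_after_update(doc, changes):
        """Apply the changes to a copy and check the encoded BSON size."""
        merged = dict(doc)
        merged.update(changes)
        return len(bson.encode(merged)) <= MAX_BSON_SIZE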
[09:24:52] <Nodex> if you need this functionality check out sparse indexes
[09:25:17] <shmoon> being unset is also considered the same? huh?
[09:25:30] <shmoon> sparse (boolean) – If true, the index only references documents with the specified field. These indexes use less space, but behave differently in some situations (particularly sorts.)
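A sketch of what creating such an index might look like from pymongo (database, collection, and field names are hypothetical):

    from pymongo import ASCENDING, MongoClient

    coll = MongoClient().mydb.mycoll  # hypothetical database and collection
    # Sparse: the index only contains entries for documents that have the
    # field at all, so sorts on it can silently drop documents missing it.
    coll.create_index([("optional_field", ASCENDING)], sparse=True)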
[09:39:59] <wromanek> I got a db designed by someone else and I'm wondering if the structure (which is complicated) is designed correctly
[09:40:43] <wromanek> I have a lot of sets embedded in Documents
[09:42:58] <wromanek> So i have a document Service which contains SubServices, and each SubService contains a lot of other objects... now when i do db.Service.find({'subservices.id': 1}); I get all Services which contain at least one SubService with the given id
[09:43:48] <wromanek> But I need only one specific SubService... so I guess I need another Document which is called SubService, and Service contains references to SubService
[09:55:07] <Nodex> wromanek : you really shouldn't use references
[09:56:39] <Nodex> if you need to return only a specific object in an array you can use the positional operator
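A sketch of the positional-operator projection Nodex describes, using wromanek's field names via pymongo; it trims the subservices array down to the first element that matched the query:

    from pymongo import MongoClient

    db = MongoClient().mydb  # hypothetical database name
    service = db.Service.find_one(
        {"subservices.id": 1},
        {"subservices.$": 1},  # project only the matched array element
    )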
[10:37:11] <umut_> I just enabled sharding on a few collections. Should I wait for the sharding process to complete before writing new data?
[10:38:23] <umut_> It's 50GB of data and probably it would take hours to complete sharding
[10:48:28] <skot> You can look at the load/performance of your system as an indicator but there is no requirement that you wait.
[10:49:10] <skot> The more IO bound you are, the more writing to those overloaded disks will cause things to slow down.
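Rather than guessing at completion, the migration can be watched from a mongos; a sketch using the shell's sharding helper (the mongos hostname is a placeholder):

    # chunk counts per shard and balancer state, printed while the balancer works
    mongo mongos.example.net --eval "sh.status()"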
[10:56:47] <marcqualie> Does anyone know how I'd get a list of dbs from command line? I'm currently trying to use mongo localhost/test --eval "db.stats()"
[10:56:54] <marcqualie> but it only returns [object Object]
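--eval prints the expression's result with toString(), which for a plain object yields "[object Object]". Wrapping the result in printjson, or asking for listDatabases directly, should print something usable; a sketch:

    mongo localhost --quiet --eval "printjson(db.adminCommand('listDatabases'))"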
[10:58:45] <sinclair|work> anyone here doing anything with google drive, i have a authentication question
[11:54:44] <ndee> hi guys, I'm running a mongodb and one DB got so big that I'm always running out of memory. So I can't do a db.[mycollection].runCommand('compact'); Is there a way to reduce the size of an existing DB?
[11:55:46] <Gargoyle> ndee: Delete some stuff? and by out of memory, do you mean disk space?
[11:57:19] <ndee> Gargoyle, I have enough disk space but the collection can't be loaded, that's the error in the log file: Fri Mar 22 12:55:02 [conn1] ERROR: mmap private failed with out of memory. (64 bit build)
[11:58:24] <Gargoyle> ndee: Strange. I would have thought swap space would have been used. And it would have just been slower.
[11:58:49] <ndee> Gargoyle, swap isn't being used, as far as I can tell from "top"
[11:59:03] <ndee> the server has 4GB of RAM and that's also the size of the collection.
[12:00:25] <Gargoyle> ndee: Not sure I can help, but you might want to pastebin some more detailed info. System details, mongo version, startup options, etc. Someone here will probably know more.
[12:01:47] <kali> ndee: a "ulimit -a" run with the user that is running mongodb may help too
[12:27:48] <DinMamma> ndee: If downtime is acceptable I would just do a mongodump of all databases, remove the files on disk and restore.
[12:28:05] <DinMamma> I've found it to be the easiest way to work in scenarios under constraints.
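A sketch of that dump-and-restore cycle (paths are hypothetical, and mongod must be stopped between the dump and the restore so the old files can be moved aside):

    mongodump --host localhost --out /backup/full-dump   # dump everything
    # stop mongod, then move the old data files aside rather than deleting them
    mv /var/lib/mongodb /var/lib/mongodb.old && mkdir /var/lib/mongodb
    # start mongod again on the now-empty dbpath, then reload the data
    mongorestore /backup/full-dump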
[13:06:00] <fommil> hi all - I'm getting different BinData when I create a UUID in the shell and via the Java API sending a UUID object. What is the correct Javascript to create a BinData to agree with the Java API?
[13:10:56] <ndee> DinMamma, alright, I will look into that, thanks
[14:26:13] <Nodex> anyone else noticed less memory usage and more swap usage since 2.4 upgrade?
[14:34:10] <houms> good day all. I am using this script http://pastie.org/7067355 to back up mongo, but it is written for a replica set. I am wondering how to back up standalone mongo instances, as my script checks that the node is not master
[14:34:31] <houms> and if it is master then it does not back up the DB
[14:38:19] <houms> can i just cut out the logic that checks master status? I assume that since it is standalone and the script uses the mongodump utility to do the actual backup, that may be fine?
[14:56:53] <houms> also if i am using mongodump on a slave is it better to use the --oplog option?
[15:18:41] <hyperboreean> hey, anyone around to help me out with an issue? I have task1 pushing data into a mongodb shard (details about orders) and after that, the same task1 pushes data to rabbitmq (only the order ids of the orders that were already pushed to mongo); task2 picks up the order ids and searches mongo for details about them, but randomly some order ids are missing from mongo. task1 checks lastError for every entry it has pushed to mongodb, but ...
[15:18:47] <hyperboreean> ... nothing wrong seems to happen. Any ideas why I get this behaviour?
[15:21:26] <freezey> anyone interested let me know
[15:21:26] <grahamhar> hyperboreean: are you querying a secondary? do you have "slow" replication?
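If that is the cause, the usual fixes are to read from the primary or to write with a write concern that waits for replication before task1 publishes the id. A pymongo sketch (hosts, database, and collection names are hypothetical):

    from pymongo import MongoClient, ReadPreference
    from pymongo.write_concern import WriteConcern

    client = MongoClient("mongodb://host1,host2/?replicaSet=rs0")
    orders = client.shop.orders

    # task1: don't publish the order id until a majority of members has the write
    acked = orders.with_options(write_concern=WriteConcern(w="majority"))
    acked.insert_one({"order_id": 42})

    # task2: read from the primary so acknowledged writes are always visible
    fresh = orders.with_options(read_preference=ReadPreference.PRIMARY)
    doc = fresh.find_one({"order_id": 42})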
[15:24:22] <skrammy> hey all. i have some highly embedded documents. is there a way to get a list of all key/value pairs that have a given key? I don't care where in the document they are
[15:25:25] <Nodex> skrammy : you will have to do it app-side
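The app-side version is a short recursive walk over the fetched document; a sketch in Python:

    def values_for_key(node, key):
        """Yield every value stored under `key`, at any nesting depth."""
        if isinstance(node, dict):
            for k, v in node.items():
                if k == key:
                    yield v
                yield from values_for_key(v, key)
        elif isinstance(node, list):
            for item in node:
                yield from values_for_key(item, key)

    # usage: list(values_for_key(document, "price"))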
[15:53:16] <houms> Nodex can I just use my existing script http://pastie.org/7067355 and just comment out the condition check for master and replica set status? also, since it's standalone, I assume i should append --oplog to the dump command?
[15:58:31] <houms> so Nodex, in a 3 node replica set it does not matter which you back up? also i get the impression that backing up the slave is not a good idea?
[15:59:00] <houms> not sure why though? none of the slaves are on a delay and we are using the --oplog option
[15:59:57] <houms> i am just trying to determine what is the best method for our replica set and our one standalone box
[16:00:18] <houms> i was under the impression that backing up the slave was okay on the replica set
[16:00:52] <houms> bean, i was asking, based on Nodex's comment, is it not counter-intuitive to back up slaves?
[16:02:28] <houms> any insight or documentation you can provide would be appreciated.
[16:02:55] <bean> houms: what he means is pick one and backup from it, i believe. Since they will all have the same data.
[16:03:07] <bean> i'd rather back up from a slave so as not to affect the master.
[16:03:47] <houms> bean completely agree about backing up the slave, so we are using the 3rd box in the set to backup from. I just wanted to make sure this would not lead to bad backups
[16:04:01] <houms> but also wanted to confirm what to do in case of the standalone instance
[16:04:20] <houms> since my script is checking for rs status and whether it is master which in standalone it will always be
[16:05:01] <kb19> has anyone reconciled mongodb's embedded document feature with structuring a rest api around resources?
[16:05:33] <houms> my script currently is not written for standalone instances, and I was hoping to get confirmation that i can just use the dump snippet from the script without worrying about rs status (which a standalone does not have) or about master/slave status (a standalone is always master)
[16:05:45] <houms> thanks again bean for your replies
[16:05:59] <houms> still learning all the mongo ins and outs.
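For the standalone box, the dump then reduces to the mongodump call itself; a sketch with a hypothetical output path. Note that --oplog should be dropped here too: it requires an oplog, which replica-set members have but a plain standalone does not, so mongodump --oplog against a standalone will fail.

    # standalone: no isMaster/rs.status() gating, and no --oplog
    mongodump --host localhost --port 27017 --out /backup/$(date +%F)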
[17:43:41] <w0rmdrink> someone told me that it's bad to run multiple mongodb instances on one system
[17:54:40] <w0rmdrink> would the page replacement algorithm not essentially take care of which mongo instance's data is in RAM and which is not
[18:25:33] <tg3> hey, is there a way to enumerate keys internally in mongodb? it seems that if you use long key lengths, your data bloat is huge. would it not be possible to use mongo's own key/value storage internally to store enumerated keys? i.e. instead of storing "username" as plaintext (8 bytes) for every entry (millions of users), it would insert "username" into an internal key/val store which returned
[18:25:33] <tg3> an integer which it then used to store that record... upon read/write operations it would use this internal reference to translate plaintext into an ID and do its operations that way?
[18:26:07] <tg3> the alternative is to use short keynames, but this is not quick or efficient.
[18:28:20] <tg3> Disk space savings could be substantial with repetitive, long variable names in documents.
[18:36:43] <tg3> it sounds like the options are either running compression in the FS under mongodb's data store (with zfs, although there seem to be several use cases showing this doesn't work) OR doing it yourself with some kind of class
[18:37:10] <tg3> its storing all the keys as plaintext
[18:37:22] <tg3> I can only imagine the memory overhead on that too if you want to keep your data hot
[18:37:38] <tg3> even if it was enabled as an optional variable
[18:37:46] <tg3> i think many would make use of it
[18:39:04] <tg3> i'm in the process of writing a translator in PHP which basically does this enumeration for various add/find/remove tasks
[18:39:19] <tg3> using mongo itself to store the keys
[18:39:27] <tg3> seems like it would make more sense to do this internally
[18:42:58] <tg3> even if you assigned a 2-byte limit to the number of keys, you could fit the most common 65,535 keys in a very small data structure (and a small in-memory footprint too). per collection you could save (keys_per_document * (avg_keylength - 2)) * N_documents bytes, which is substantial in many cases
[18:45:58] <eka> anyone knows what this means? info DFM::findAll(): extent 0:7e000 was empty, skipping ahead.
[18:46:16] <tg3> it would be max 4223KB of overhead to keep a 2-byte table of enumerated keys (for keys up to 64 characters long), which conveniently fits in L3/L2.
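A minimal sketch of such a translator, in Python rather than tg3's PHP, handling top-level keys only; the mapping collection and id allocator are hypothetical, and the allocation shown is racy under concurrent writers:

    from pymongo import MongoClient

    db = MongoClient().mydb      # hypothetical database
    key_map = db.key_map         # documents like {"_id": 3, "name": "username"}

    def key_id(name):
        """Look up the integer id for a key name, allocating one if missing."""
        found = key_map.find_one({"name": name})
        if found:
            return found["_id"]
        new_id = key_map.count_documents({})  # naive, racy allocator
        key_map.insert_one({"_id": new_id, "name": name})
        return new_id

    def shrink(doc):
        """Rewrite long key names to integer ids before insert ("_id" kept)."""
        return {k if k == "_id" else str(key_id(k)): v for k, v in doc.items()}

    def expand(doc):
        """Reverse the mapping on read."""
        names = {d["_id"]: d["name"] for d in key_map.find()}
        return {k if k == "_id" else names[int(k)]: v for k, v in doc.items()}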
[18:46:58] <khinester> i am trying to use mongo-perf; my mongodb path is /usr/local/var/mongodb, but when i run $ python runner.py i get "running . --fork --syslog --port 27017 --dbpath /Users/khinester/Sandboxes/mongo-perf/db" and then "Could not setup local mongod <type 'exceptions.OSError'>"
[18:59:26] <kb19> has anyone nested embedded documents only to separate them into collections?
[18:59:58] <kb19> trying to figure out the best approach but there doesn't seem to be any consensus
[22:01:36] <RoboTeddy> is there a way to use `fields` to query for just one matched element out of an array?
[22:08:05] <RoboTeddy> oh hey: http://docs.mongodb.org/manual/reference/projection/elemMatch/ new in 2.2
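For reference, that $elemMatch projection from pymongo might look like this (database, collection, and field names are hypothetical):

    from pymongo import MongoClient

    db = MongoClient().mydb  # hypothetical database
    # $elemMatch projection (2.2+): return only the first array element
    # that satisfies the condition, instead of the whole comments array.
    doc = db.posts.find_one(
        {"comments.author": "alice"},
        {"comments": {"$elemMatch": {"author": "alice"}}},
    )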