PMXBOT Log file Viewer

#mongodb logs for Friday the 22nd of March, 2013

[02:48:07] <penzur> howdy
[02:54:36] <freezey> looking for a mongodba
[02:54:40] <freezey> anyone want a job?
[05:15:33] <hyperboreean> hey anyone around to help me out with an issue? I have task1 pushing data into a mongo db shard (details about orders) and after that, the same task1 pushes data on rabbitmq (only the order ids of the orders that were already pushed to mongo); task2 picks up the order ids, searches mongo for details about them, but randomly some order ids are missing from mongo. task1 checks lastError for every entry that has pushed to mongodb, but ...
[05:15:39] <hyperboreean> ... nothing wrong seems to happen. Any ideas why I get this behaviour?
[05:20:29] <sirious> has anyone found an efficient way in pymongo to check the document size before updating and calculate if the update will exceed the document size limit?
[05:45:39] <jgiorgi> sirious: as a general rule of thumb if you're worried about the document size your document is too big, if you really want to you can use getsizeof on the individual elements in your document
[05:46:16] <jgiorgi> but like i said, you probably have something really big that should be stored with gridfs or you have way too much in one document
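A minimal sketch of the app-side check being discussed, measuring the encoded BSON size (which is what the server's 16 MB limit applies to) rather than sys.getsizeof; new_doc is a hypothetical document:

    import bson  # the bson package ships with pymongo

    MAX_BSON = 16 * 1024 * 1024  # server-enforced 16 MB document limit

    def encoded_size(doc):
        # serialize to BSON and measure exactly what the server will see;
        # newer pymongo spells this bson.encode(doc)
        return len(bson.BSON.encode(doc))

    new_doc = {"payload": "x" * 1024}  # placeholder document
    if encoded_size(new_doc) > MAX_BSON:
        pass  # too big: restructure the document or move blobs to GridFS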
[07:30:27] <Sgoettschkes> Morning
[07:51:55] <amitprakash> Hi, how do I create sparse indexes via pymongo?
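For the record, pymongo passes index options straight through as keyword arguments; a sketch (database, collection, and field names are placeholders):

    from pymongo import ASCENDING, MongoClient

    coll = MongoClient("localhost", 27017).mydb.things
    # sparse=True (like unique=True) is forwarded to the server as an index option;
    # create_index takes the same arguments in later pymongo releases
    coll.ensure_index([("some_field", ASCENDING)], sparse=True)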
[08:44:40] <[AD]Turbo> ciao all
[09:04:03] <sowe> hi all
[09:13:24] <wromanek> haalo
[09:22:03] <shmoon> look
[09:22:26] <shmoon> with a unique index i cant have 2 documents where the key is null? or 2 documents where the key is unset in both?
[09:24:29] <Nodex> no
[09:24:39] <Nodex> null is considered a value
[09:24:52] <Nodex> if you need this functionality check out sparse indexes
[09:25:17] <shmoon> being unset is also considered the same? huh?
[09:25:30] <shmoon> sparse (boolean) – If true, the index only references documents with the specified field. These indexes use less space, but behave differently in some situations (particularly sorts.)
[09:25:34] <shmoon> hm
[09:25:54] <shmoon> i guess it still wont work for null documents
[09:26:03] <shmoon> but might work for unset ones
[09:27:52] <Nodex> null is a value so you can either not set it or work out another way
[09:34:06] <shmoon> hm shouldnt this be enough
[09:34:09] <shmoon> db.user.ensureIndex( { system_id: 1 }, { unique: true, dropDups: true, sparse: true } );
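What that index does in practice: the sparse flag skips documents that omit system_id entirely, but an explicit null is still a value and still collides under unique. A pymongo sketch of both cases (collection name is a placeholder):

    from pymongo import MongoClient
    from pymongo.errors import DuplicateKeyError

    users = MongoClient().test.user
    users.ensure_index("system_id", unique=True, sparse=True)

    users.insert({"name": "a"})  # no system_id: skipped by the sparse index
    users.insert({"name": "b"})  # also fine, for the same reason
    users.insert({"name": "c", "system_id": None})
    try:
        users.insert({"name": "d", "system_id": None})
    except DuplicateKeyError:
        pass  # an explicit null IS indexed, so the second null collides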
[09:35:21] <wromanek> Hi guys, is there any way to get SubService by id from Service document which contains many subServices?
[09:35:47] <Nodex> shmoon : yes
[09:36:06] <Nodex> wromanek : that doesn't make sense
[09:36:13] <Nodex> can you explain it in a different way?
[09:36:47] <shmoon> strange i put that in a file and $ mongo localhost:27017/collection bin/mongo.js - but nope :S
[09:38:28] <shmoon> i see, i drop it and then execute the file it works
[09:39:23] <wromanek> thats only an example
[09:39:59] <wromanek> I got a db designed by someone else and I'm wondering if the structure (which is complicated) is designed correctly
[09:40:43] <wromanek> I have a lot of sets embedded in Documents
[09:42:58] <wromanek> So i have document Service which contains SubServices and each SubService contains a lot of other objects... now when i do db.Service.find({'subservices.id': 1}); I get all Services which contain at least one SubService with the given id
[09:43:48] <wromanek> But I need only one specified SubService... so I guess I need another Document which is called SubService, and Service contains references to SubService
[09:55:07] <Nodex> wromanek : you really shouldnt use references
[09:56:39] <Nodex> if you need to return only a specific object in an array you can use the positional operator
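The positional projection Nodex is referring to, applied to the query above (a pymongo sketch; the database name is a placeholder):

    from pymongo import MongoClient

    service = MongoClient().mydb.Service
    # "subservices.$" projects only the first array element that matched the query
    cursor = service.find({"subservices.id": 1}, {"subservices.$": 1})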
[10:37:11] <umut_> I just enabled sharding on a few collections. Should I wait till the sharding process is completed before writing new data?
[10:38:23] <umut_> It's 50GB of data and probably it would take hours to complete sharding
[10:48:28] <skot> You can look at the load/performance of your system as an indicator but there is no requirement that you wait.
[10:49:10] <skot> The more IO bound you are, the more writing to those overloaded disks will cause things to slow down.
[10:56:47] <marcqualie> Does anyone know how I'd get a list of dbs from command line? I'm currently trying to use mongo localhost/test --eval "db.stats()"
[10:56:54] <marcqualie> but it only returns [object Object]
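The [object Object] comes from --eval stringifying the result object; wrapping it in printjson fixes that:

    $ mongo localhost/test --quiet --eval "printjson(db.adminCommand('listDatabases'))"

The same list is a one-liner from pymongo, if that is an option:

    from pymongo import MongoClient
    print(MongoClient("localhost").database_names())  # list_database_names() in later pymongo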
[10:58:45] <sinclair|work> anyone here doing anything with google drive, i have a authentication question
[10:58:46] <sinclair|work> ?
[11:08:01] <Nodex> sinclair|work : try #google-drive
[11:08:22] <sinclair|work> Nodex: popular channel that
[11:08:36] <Nodex> :)
[11:10:18] <sinclair|work> Nodex: https://www.youtube.com/watch?v=dwOv7vgizxg&feature=player_detailpage#t=423s
[11:10:20] <sinclair|work> found this
[11:29:09] <Nodex> not sure what it has to do with MongoDB
[11:49:04] <s1n4> hey, erlang driver doesnt work, why?
[11:50:41] <Nodex> lol
[11:50:42] <Nodex> 42
[11:54:44] <ndee> hi guys, I'm running a mongodb and one DB got so big that I'm always out of memory. So I can't do a db.[mycollection].runCommand('compact'); Is there a way to reduce the size of an existing DB?
[11:55:46] <Gargoyle> ndee: Delete some stuff? and by out of memory, do you mean disk space?
[11:56:08] <Gargoyle> Morning Nodex! :)
[11:56:09] <ndee> Gargoyle, no, out of RAM
[11:56:20] <Nodex> Gargoyle o/
[11:57:19] <ndee> Gargoyle, I have enough disk space but the collection can't be loaded, that's the error in the log file: Fri Mar 22 12:55:02 [conn1] ERROR: mmap private failed with out of memory. (64 bit build)
[11:58:24] <Gargoyle> ndee: Strange. I would have thought swap space would have been used. And it would have just been slower.
[11:58:39] <Nodex> it should be paging
[11:58:49] <ndee> Gargoyle, swap isn't being used, as far as I can tell from "top"
[11:59:03] <ndee> the server has 4GB of RAM and that's also the size of the collection.
[12:00:25] <Gargoyle> ndee: Not sure I can help, but you might want to pastebin some more detailed info. System details, mongo version, startup options, etc. Someone here will probably know more.
[12:01:47] <kali> ndee: a "ulimit -a" run with the user that is running mongodb may help too
[12:27:48] <DinMamma> ndee: If downtime is acceptable I would just do a mongodump of all databases, remove files on disk and restore.
[12:28:05] <DinMamma> Ive found it to be the easiest way to work in scenarios under constraints.
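A sketch of the dump-and-restore cycle DinMamma describes, which rewrites the data files compactly; paths are placeholders, and the old files are moved aside rather than deleted until the restore is verified:

    $ mongodump --out /backup/alldbs        # dump every database
    $ # stop mongod, then:
    $ mv /var/lib/mongodb /var/lib/mongodb.old
    $ mkdir /var/lib/mongodb                # adjust ownership for your setup
    $ # start mongod again, then:
    $ mongorestore /backup/alldbs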
[12:40:06] <Jester01> Hi people! Does db.collection.reIndex() recreate the automatic index on _id too?
[12:53:57] <skot> yes
[12:54:20] <Jester01> thanks
[13:01:13] <wromanek> Nodex: but am I able to get specific object using some conditions? I would like to avoid using foreach loops...
[13:05:17] <Nodex> yes
[13:05:23] <Nodex> with the positional operator
[13:06:00] <fommil> hi all - I'm getting different BinData when I create a UUID in the shell and via the Java API sending a UUID object. What is the correct Javascript to create a BinData to agree with the Java API?
[13:10:56] <ndee> DinMamma, alright, I will look into that, thanks
[14:26:13] <Nodex> anyone else noticed less memory usage and more swap usage since 2.4 upgrade?
[14:34:10] <houms> good day all. I am using this script http://pastie.org/7067355 to backup mongo, but this is for replica set mongo. I am wondering how to backup standalone mongo instances, as my script checks it is not master
[14:34:31] <houms> and if it is master then it does not back up the DB
[14:38:19] <houms> can i just cut the logic out that checks master status? I assume that since it is standalone, and the script uses the mongodump utility to execute the actual backup, that may be fine?
[14:56:53] <houms> also if i am using mongodump on a slave is it better to use the --oplog option?
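For context on --oplog: it makes mongodump also record the operations that happen while the dump is running (into oplog.bson), so that mongorestore --oplogReplay yields a consistent point-in-time snapshot. It requires the target to have an oplog, i.e. a replica-set member, so it helps on a secondary but cannot be used against a plain standalone. A sketch with a placeholder host and path:

    $ mongodump --host secondary1:27017 --oplog --out /backup/$(date +%F)
    $ mongorestore --oplogReplay /backup/2013-03-22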
[15:18:41] <hyperboreean> hey anyone around to help me out with an issue? I have task1 pushing data into a mongo db shard (details about orders) and after that, the same task1 pushes data on rabbitmq (only the order ids of the orders that were already pushed to mongo); task2 picks up the order ids, searches mongo for details about them, but randomly some order ids are missing from mongo. task1 checks lastError for every entry that has pushed to mongodb, but ...
[15:18:47] <hyperboreean> ... nothing wrong seems to happen. Any ideas why I get this behaviour?
[15:20:25] <Nodex> safe writes?
[15:21:19] <freezey> looking for a mongoDBA
[15:21:26] <freezey> anyone interested let me know
[15:23:00] <grahamhar> hyperboreean: are you querying a secondary, do you have "slow" replication
[15:24:22] <skrammy> hey all. i have some highly embedded documents. is there a way to get a list of all key/value pairs that have a given key? I dont care where in the document they are
[15:25:25] <Nodex> skrammy : you will have to do it appside
[15:25:57] <skrammy> bummer
[15:26:37] <Nodex> it's a trivial thing to do appside
[15:27:11] <skrammy> yeah - and i can just store the info in mongodb or memcache
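The app-side walk is indeed short; a sketch in plain Python that collects every value stored under a given key, at any depth in a nested document:

    def find_key(obj, key, found=None):
        # recursively collect all values stored under `key` in nested dicts/lists
        if found is None:
            found = []
        if isinstance(obj, dict):
            for k, v in obj.items():
                if k == key:
                    found.append(v)
                find_key(v, key, found)
        elif isinstance(obj, list):
            for item in obj:
                find_key(item, key, found)
        return found

    # e.g. find_key(doc, "price") -> every value keyed "price", anywhere in doc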
[15:41:31] <houms> can i back up a standalone mongo instance using mongodump?
[15:41:43] <houms> while it is running
[15:47:11] <hyperboreean> grahamhar: I have sharding, aren't these 2 different things?
[15:47:46] <grahamhar> hyperboreean: with in each shard do you have a replicaset or just a single node?
[15:48:03] <hyperboreean> grahamhar: single node
[15:48:24] <grahamhar> hyperboreean: assumed you had replicaset :|
[15:48:28] <hyperboreean> Nodex: I already do safe writes, I'm using pymongo and passing in the safe=True parameter
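For reference, safe=True in pymongo of this era issues a getLastError check after each write (a w=1 acknowledgement). A stricter variant is to also wait for the journal, which rules out losing an acknowledged write to a crash; a sketch with placeholder names:

    from pymongo import MongoClient

    orders = MongoClient().shop.orders
    # safe=True == acknowledged write (w=1); j=True additionally waits for the journal
    orders.insert({"order_id": 123, "status": "new"}, safe=True, j=True)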
[15:50:29] <Nodex> houms : yes
[15:53:16] <houms> Nodex can I just use my existing script http://pastie.org/7067355 and just comment out the condition check for master and replica set status? also I assume since its standalone and running i should append --oplog to the dump command?
[15:53:25] <houms> thanks Nodex for confirming
[15:54:18] <Nodex> is it not counterintuitive to back up slaves?
[15:54:27] <Nodex> you only ever need the master
[15:58:31] <houms> so Nodex, in a 3 node replica set it does not matter which you back up? also i get the impression that backing up the slave is not a good idea?
[15:59:00] <houms> not sure why though? none of the slaves are on a delay and we are using the --oplog option
[15:59:05] <bean> why would it be a bad idea?
[15:59:57] <houms> i am just trying to determine what is the best method for our replica set and our one standalone box
[16:00:18] <houms> i was under the impression that backing up the slave was okay on the replica set
[16:00:52] <houms> bean i was saying that based on Nodex's comment "is it not counterintuitive to back up slaves?"
[16:02:28] <houms> any insight or documentation you can provide would be appreciated.
[16:02:55] <bean> houms: what he means is pick one and backup from it, i believe. Since they will all have the same data.
[16:03:07] <bean> i'd rather back up from a slave as to not affect the master.
[16:03:47] <houms> bean completely agree about backing up the slave, so we are using the 3rd box in the set to backup from. I just wanted to make sure this would not lead to bad backups
[16:04:01] <houms> but also wanted to confirm what to do in case of the standalone instance
[16:04:03] <bean> I can't imagine that it would
[16:04:20] <houms> since my script is checking for rs status and whether it is master which in standalone it will always be
[16:05:01] <kb19> has anyone reconciled mongodb's embedded document feature with structuring a rest api around resources?
[16:05:33] <houms> my script currently is not written for standalone instances, and I was hoping to get confirmation that I can just use the dump snippet from the script without worrying about rs status (since it does not exist) or about master/slave status (since a standalone is always master)
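Against a standalone mongod the replica-set checks (and --oplog) simply do not apply, so the dump portion of such a script reduces to something like the following (host and path are placeholders). mongodump reads from the live server, so it can run while mongod is up, though without an oplog the dump is not point-in-time consistent across collections:

    $ mongodump --host localhost:27017 --out /backup/mongo/$(date +%F)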
[16:05:45] <houms> thanks again bean for your replies
[16:05:59] <houms> still learning all the mongo ins and outs.
[16:06:09] <bean> heh me too.
[16:06:19] <bean> Just been sysadmining a while and have done some reading
[17:01:33] <houms> bean me too
[17:01:37] <houms> are you in linux env
[17:01:43] <bean> yep
[17:07:56] <houms> bean do you have any standalone instances of mongo that are being backed up?
[17:08:12] <bean> yep.
[17:08:16] <bean> just mongodumping them
[17:09:43] <houms> are you using oplog option?
[17:15:49] <bean> nope,
[17:16:00] <bean> tbh, the data in that mongo store is disposable
[17:16:11] <bean> and the stuff that isn't disposable is in replica sets
[17:24:04] <houms> that makes sense
[17:27:09] <houms> unfortunately my devs are a pain in my ass sometimes
[17:34:38] <w0rmdrink> hi
[17:43:41] <w0rmdrink> someone told me that its bad to run multiple mongodb instances on one system
[17:54:40] <w0rmdrink> would the page replacement algorithm not essentially take care of which mongo instance's data is in RAM and which is not
[18:25:33] <tg3> hey, is there a way to enumerate keys internally in mongodb? it seems that if you use long key lengths, your data bloat is huge. would it not be possible to use mongo's own key/value storage internally to store enumerated keys? ie: instead of storing "username" as plaintext (8 bytes) for every entry (millions of users), it would insert "username" into an internal key/val store which returned
[18:25:33] <tg3> an integer which it then used to store that record... upon read/write operations it would use this internal reference to translate plaintext into ID and do its operations that way?
[18:26:07] <tg3> the alternative is to use short keynames, but this is not quick or efficient.
[18:28:20] <tg3> Disk space savings could be substantial with repetitive, long variable names in documents.
[18:32:37] <nDuff> tg3: ...I don't know MongoDB's internals, but incidentally, BaseX implements exactly that.
[18:36:43] <tg3> it sounds like either running compression in the FS under mongodb's data store (with zfs, although there seem to be several use cases that show this doesn't work) OR by doing it yourself with some kind of class
[18:36:46] <tg3> is the only way as of now
[18:36:55] <tg3> but I think for my users table, about 40% of the disk storage space is the keys
[18:37:00] <tg3> if I look at the binary store for it
[18:37:02] <tg3> on disk
[18:37:10] <tg3> its storing all the keys as plaintext
[18:37:22] <tg3> I can only imagine the memory overhead on that too if you want to keep your data hot
[18:37:38] <tg3> even if it was enabled as an optional variable
[18:37:46] <tg3> i think many would make use of it
[18:39:04] <tg3> i'm in the process of writing a translator in PHP which basically does this enumeration for various add/find/remove tasks
[18:39:19] <tg3> using mongo itself to store the keys
[18:39:27] <tg3> seems like it would make more sense to do this internally
[18:42:58] <tg3> even if you assigned a 2-byte limit to the number of keys, you could fit the most common 65,535 keys in a very small data structure (with a small in-memory footprint too); per collection you could save (keys_per_document * (avg_keylength - 2)) * N_documents bytes, which is substantial in many cases
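A minimal sketch of the kind of app-side translator being described: a side collection maps each long key name to a stable short code, and documents are rewritten on the way in. All names are placeholders, and the count()-based allocator is racy under concurrent writers (a real version would want an atomic counter):

    from pymongo import MongoClient

    db = MongoClient().mydb
    keymap = db.keymap  # one doc per known key: {"_id": "username", "short": "k0"}

    def short_key(long_key):
        # return the stable short code for a long key, allocating one if new
        doc = keymap.find_one({"_id": long_key})
        if doc is None:
            code = "k%d" % keymap.count()  # NB: racy; fine for a single writer
            keymap.insert({"_id": long_key, "short": code})
            return code
        return doc["short"]

    def shrink(doc):
        # rewrite top-level keys; recurse similarly for embedded documents
        return dict((short_key(k), v) for k, v in doc.items())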
[18:45:49] <eka> hi all
[18:45:58] <eka> anyone knows what this means? info DFM::findAll(): extent 0:7e000 was empty, skipping ahead.
[18:46:16] <tg3> it would be max 4223KB of overhead to keep a 2-byte table of enumerated keys (for keys up to 64 characters long), which conveniently fits in L3/L2.
[18:46:58] <khinester> i am trying to use mongo-perf; my mongodb path is /usr/local/var/mongodb, but when i run "$ python runner.py" i get "running . --fork --syslog --port 27017 --dbpath /Users/khinester/Sandboxes/mongo-perf/db" and then "Could not setup local mongod <type 'exceptions.OSError'>"
[18:59:26] <kb19> has anyone nested embedded documents only to separate them into collections?
[18:59:58] <kb19> trying to figure out the best approach but there doesn't seem to be any consensus
[22:01:36] <RoboTeddy> is there a way to use `fields` to query for just one matched element out of an array?
[22:08:05] <RoboTeddy> oh hey: http://docs.mongodb.org/manual/reference/projection/elemMatch/ new in 2.2
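The $elemMatch projection from that page, as it looks from pymongo (collection, field names, and values are placeholders):

    from pymongo import MongoClient

    coll = MongoClient().mydb.results
    # return only the first array element matching the sub-query (MongoDB >= 2.2)
    docs = coll.find(
        {"name": "bob"},
        {"scores": {"$elemMatch": {"value": {"$gt": 80}}}},
    )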