[00:38:12] <multi_io> could it be that spring-data identifies the properties in its @Documents via the defined fields, rather than using beans/BeanInspector???
[01:19:47] <jonwage> anyone ever gotten this error when connecting to mongod from php?
[01:29:39] <dstorrs> did you try this? : http://php.net/manual/en/class.mongoconnectionexception.php
[01:29:50] <dstorrs> maybe you need to specify the host / port / etc
[01:30:54] <jonwage> ok, it's gotta be something related to that. we have a default configuration for our app that works with most local dev environments but for some reason it's not working on this new laptop running osx
[05:22:36] <rubynoob2> hi.. for the first time I am using mongodb to store json results from a webservice to my local mongodb database.. i finally got it working.. and it has exceeded my expectations.. the speed of mongodb is unbelievably fast, it's just mind-blowing..
[05:22:56] <rubynoob2> I am so excited.. i have no one to share this with because my friends are sql fans...
[05:23:06] <rubynoob2> so i share this with you strangers in this channel.. i'm shaking..
[05:23:42] <rubynoob2> shaking with disbelief at how fast mongodb is... i mean.. it's so fast.... do I need to bother with adding memcached to my stack if mongodb is so goddamn fast??
[05:47:43] <henrykim> is it a good idea to use a sha1 hash as a shard key?
[05:48:04] <jwilliams_> it's recommended that the shard key be unique.
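For context, sharding on a precomputed hash field looked like this in the shell of this era; the database, collection, and field names below are hypothetical. A sha1 value distributes writes evenly, at the cost of making range queries on the key useless:

    db.adminCommand({ enablesharding: "mydb" });
    db.adminCommand({ shardcollection: "mydb.photos", key: { sha1: 1 } });
    // every document needs the sha1 field, and a shard key value cannot be changed later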
[06:25:07] <henrykim> how much do the balancer's jobs affect mongodb's performance?
[06:25:39] <henrykim> for example, I currently have 2 shards with 320 chunks each.
[06:25:53] <henrykim> and then I add a new shard.
[06:26:24] <henrykim> how can I tell when the rebalancing jobs are done?
[06:27:00] <henrykim> how can I tell how much the balancing jobs affect mongodb's performance?
[06:27:58] <wereHamster> henrykim: don't guess. Test and benchmark yourself
[06:27:58] <dstorrs> henrykim: no idea, but let me know when you find the answer. I'm getting into the same boat very soon
[06:31:20] <henrykim> in a router server, mongos print out ChunkManager's log like "Wed Jun 6 15:27:39 [WriteBackListener-search-ddm-test4.daum.net:27018] ChunkManager: time to load chunks for blog.article: 25ms sequenceNumber: 405 version: 128|1"
[06:32:26] <henrykim> it has sequenceNumber and version of it.
[06:33:34] <henrykim> sequenceNumber is an increasing number; I guess this is the clue that the ChunkManager splits a single chunk (64MB) and then moves it to a destination.
[06:33:50] <henrykim> it does copy and then delete.
[07:02:02] <mbuf> is there a recommended way to ensure that no duplicate inserts happen in a mongo database for a compound index?
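A unique compound index is the standard answer: the duplicate insert fails instead of creating a second document. A minimal shell sketch, with hypothetical collection and field names:

    db.items.ensureIndex({ a: 1, b: 1 }, { unique: true });
    db.items.insert({ a: 1, b: 2 });   // ok
    db.items.insert({ a: 1, b: 2 });   // rejected with an E11000 duplicate key error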
[08:16:15] <eloy_> Hi, one question about the MongoDB PHP driver
[08:16:46] <eloy_> When I use a query with $gte and $lte on a Debian host it doesn't run
[08:18:16] <eloy_> But if I use an Ubuntu host it works fine. I updated (on the Debian host) to php 5.3.13 and compiled the latest revision of php-mongo from github but it still doesn't run ok
[09:31:09] <NodeX> the only thing I can think of is the query is not running due to memory/index problems
[09:33:23] <jwilliams_> I'm hitting issue 176 (http://code.google.com/p/morphia/issues/detail?id=176). is there any way to solve it? After reading the issue, i do not see a solution provided.
[09:37:41] <eloy_> Any solution? Why does it work on the Ubuntu host?
[09:38:37] <NodeX> more free ram maybe - I don't know - it's just a guess
[09:42:14] <bjori> eloy_: is it the same mongodb server, and two different php frontends?
[09:47:56] <yawniek> my slave constantly has delays between 20m and 2h, io wait about 10%, mongodb 2.0.4. what should i check to figure out why this happens?
[09:48:10] <eloy_> Ubuntu is a 32-bit platform, Debian is 64-bit
[09:50:40] <bjori> eloy_: that integer in your code is a 64-bit integer, not 32. so on your 32-bit platform it will be a totally different number (a literal that big overflows PHP's native int and becomes a float)
[09:51:01] <bjori> eloy_: do for example php -r 'var_dump(1338969265000);'
[11:07:26] <giskard> is there a way to specify safe=false in the config file instead of passing it in the URI?
[11:08:22] <giskard> (and/or do you know how to pass it using the casbah mongo driver?)
[11:13:36] <mongonub> When stroring references/links, should the reference ids be stored as plain text or as ObjectIds? ie. photo_id: '4fcf364919f9bebce2000001' vs photo_id: ObjectId('4fcf364919f9bebce2000001')?
[11:14:38] <algernon> what would be the point of storing them as text?
[11:16:54] <Killerguy> I'm trying but it doesn't work
[11:18:02] <NodeX> check the log - see what went wrong
[11:18:31] <mongonub> Another question: I need to generate non-predictable ids for photos.. is it still possible to make use of ObjectIds or do I have to go plain text based keys?
[11:19:53] <NodeX> you can tell mongo what its OIDs are
[11:26:59] <simenbrekken> Has anyone had success running mongorestore over a SSH tunnel?
[11:27:06] <nicholasdipiazza> I have the document {docId:1, inner:{docId:2, value:'myval'}}. I tried: db.tmp.find({"inner":{"$elemMatch":{"value":"val"}}}). Doesn't work. What am I missing?
[11:27:20] <simenbrekken> I keep getting an authentication failure assertion: 9997 auth failed: { errmsg: "auth fails", ok: 0.0 }
[11:27:45] <nicholasdipiazza> I have the document {docId:1, inner:{docId:2, value:'val'}}. I tried: db.tmp.find({"inner":{"$elemMatch":{"value":"val"}}}). I am searching for all documents with inner.value='val'. Doesn't work. What am I missing?
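For reference: $elemMatch matches elements of an array, but inner above is a plain subdocument, so that query can never match. Dot notation is the way to reach into a subdocument:

    db.tmp.find({ "inner.value": "val" });
    // $elemMatch would apply if inner were an array: { inner: [ { value: 'val' }, ... ] }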
[11:36:17] <mongonub> NodeX: So just so I understand.. I can generate random non-predictable OIDs by generating a 24-character string where each char is a hex digit [0-9a-f] and then casting that to ObjectId?
[11:36:43] <mongonub> Given that MongoDB makes some assumptions about ObjectIds, such as their ascending order, can this result in some unexpected behavior such as favoring some photos over others in some queries or something like that?
[11:46:03] <nicholasdipiazza> I have {docID:1, inner : [ {id:2, chunks:[{id:3, subType:'PO'}, {id:4, subType:'AUDIO'}]} ]} and I need to query all chunks with subType Audio. I tried this db.tmp.find({inner:{"$elemMatch":{"chunks":{"$elemMatch":{"subType":"AUDIO"}}}}}) but it just returns the full document. What am I missing?
[12:06:29] <mongonub> Because of how indexing works
[12:07:04] <edussooriya> is there a way to get stats like db.pages.find().explain() in update()
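There is no explain() for update() in this era; the usual workaround is to explain a find() with the same criteria, since the update's document selection goes through the same query planner (the criteria below are hypothetical):

    db.pages.find({ url: "/foo" }).explain();   // same criteria as the intended update()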
[12:16:30] <nicholasdipiazza> oops nope I'm blind. i'm not rockin. that still returns just the outer document.
[12:16:50] <nicholasdipiazza> does every mongo query just return the root document, even though you are searching for inner documents?
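Yes: find() always returns whole root documents, and the criteria only decide which ones. Projection (find's second argument) can drop fields, but extracting only the matching array elements needs either the $elemMatch projection added in 2.2 or the aggregation framework. A sketch against the nested shape above, assuming a 2.2 server:

    db.tmp.aggregate(
      { $unwind: "$inner" },
      { $unwind: "$inner.chunks" },
      { $match: { "inner.chunks.subType": "AUDIO" } },
      { $project: { chunk: "$inner.chunks" } }
    );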
[12:26:10] <amitprakash> Hi, for a given collection with keys a and b, I want to find the counts [number of documents] where either a or b = some value x.. However, I want to do this for a list of such values of x, X = [x]; what would be the mongo query for this?
[12:29:50] <nicholasdipiazza> @amitprakash: db.tmp.find({DocID:1234}).limit(1)[0].inner.length; finds the array length of this --> {DocID:1234, inner:[{}, {}, ... ]}
[12:29:53] <nicholasdipiazza> not sure if that helps you
[12:31:11] <amitprakash> nicholasdipiazza, don't really understand that
[12:31:37] <nicholasdipiazza> oh sorry, no. what you're asking wasn't described by that.
[12:32:35] <amitprakash> currently I am doing db.collection.count({'$or': [{'a': x}, {'b': x}]}) while iterating over X, but I was wondering if this couldn't be done with a single query
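On a 2.0-era server, one way to get all the counts in a single pass is map/reduce, passing the value list in via scope; a sketch with hypothetical values. A document whose a and b both match emits once per distinct value, so each doc counts once per x:

    var X = [1, 2, 3];
    db.collection.mapReduce(
      function () {
        if (X.indexOf(this.a) !== -1) emit(this.a, 1);
        if (X.indexOf(this.b) !== -1 && this.b !== this.a) emit(this.b, 1);
      },
      function (key, values) { return Array.sum(values); },
      {
        query: { $or: [ { a: { $in: X } }, { b: { $in: X } } ] },
        scope: { X: X },
        out: { inline: 1 }
      }
    );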
[12:42:01] <mongonub> What is meant by the invariance of ObjectId?
[12:56:39] <mongonub> It is mentioned that UUIDs should be stored as BinData.. could anyone show an example of how to convert SecureRandom.uuid to BinData in Ruby?
[13:41:53] <mongonub> wereHamster: BSON::Binary.initialize(SecureRandom.uuid, BSON::Binary::SUBTYPE_UUID) => NoMethodError: private method `initialize' called for BSON::Binary:Class.. not sure how to use that class
[13:43:02] <mongonub> oh wtf, I need to sleep.. initialize => new of course
[13:43:43] <skot> modcure: everything is memory mapped by the operating system. What the OS can fit in memory will be and the rest will only be on disk.
[13:44:21] <skot> The OS will flush to disk in the background when mongod asks (every 60s by default) or when you do a safe write w/fsync or via the fsync command
[13:44:58] <modcure> skot, when mongodb fires up for the first time... the VMM will grab the data files and store them in active (physical ram), and whatever doesn't fit, place the rest in virtual memory (mapped files) on disk?
[13:49:03] <modcure> skot, the mongodb website says: MongoDB uses memory mapped files for managing and interacting with all data. MongoDB memory maps data files to memory as it accesses documents. Data that isn't accessed is not mapped to memory.
[13:51:43] <mongonub> wereHamster: do you know what the UUID subtype does and whether it expects a certain data format (bytes, hex, string) and/or length? documentation on the BSON website is scarce as well :(
[13:57:31] <skot> modcure: no data is lost but only data that is used will be read into memory and the OS will release memory for things not recently used when the memory is needed.
[13:58:38] <skot> mongonub: this might help: https://jira.mongodb.org/browse/JAVA-403
[13:58:49] <hillct> Good morning all. For some reason I can't find the .findOne( notation relating to $oid object ID lookups. Can someone remind me, or point me to the proper doc page? I thought it was db.myCollection.find({_id: {$oid: "myIdBSON-string-here"}})
[14:01:02] <skot> default is 27017 for the mongodb wire protocol, +1000 for the web admin port
[14:01:05] <modcure> skot, virtual memory is a block on hard disk acting as if it were physical memory. to me this would imply double the space for my data. no ?
[14:01:29] <skot> nope, best you should read up on virtual memory and memory mapped files.
[14:02:36] <skot> modcure: on my way out, best you talk to someone about memory in general as this has little to do with mongodb, or search the group linked in the topic here
[14:03:13] <hillct> NodeX: in this form? I get a JS parse error... collectionname.find({ "_id" : ObjectId( "4fcf5ee350b77eee45000001" )})
[14:05:18] <NodeX> remove the spaces and try again
[14:05:42] <NodeX> should work fine .. perhaps the spaces (i don't think so but stranger things have happened)
[14:05:48] <souza> Guys, do you know a way to create OIDs for objects in an array? for example i have this structure > http://pastebin.com/Ki1q37f4 but only my user object gets an OID if i run it, getting this > http://pastebin.com/3nY1tz50 i want all the objects in my bson to get an OID
[14:06:12] <NodeX> souza : your app has to do that
[14:09:44] <nicholasdipiazza> Hi guys. Here is a javascript mongo script that will update all members of an array. http://pastebin.com/mVEkzbC3 Is there no way to do this with one query instead of using a loop?
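There is no update-every-array-element operator in this era (the positional $ only rewrites the first match), so a loop is the standard workaround; a sketch assuming a hypothetical scores array:

    db.things.find({ scores: { $exists: true } }).forEach(function (doc) {
      for (var i = 0; i < doc.scores.length; i++) {
        doc.scores[i] = doc.scores[i] + 1;   // whatever per-element change is needed
      }
      db.things.save(doc);   // one write per document rather than per element
    });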
[14:09:45] <masone> Hey all. My mongo has a high page faults rate (100/s). The data is about three times the ram but the ram does not seem to get fully used. What's the correct way to determine the real working set size / real memory usage?
[14:12:31] <hillct> NodeX: my issue seems to relate to a modification made to my queries when run through the mongoHub Mac Mongo client tool, so not a DB problem...
[14:13:42] <souza> NodeX: I'm trying it via the mongo shell
[14:23:30] <NodeX> not an overpriced pretentious hardware fan tbh
[14:24:25] <NodeX> I'm just trolling coz i'm bored
[14:30:34] <souza> NodeX: I've created vars with OIDs inside and pass them in at object instantiation.
[14:32:35] <NodeX> dunno what that means but kudos ;)
[15:12:36] <senior7515> i see that 2.2 was released http://www.mongodb.org/display/DOCS/2.2+Release+Notes how do i download the stable branch of 2.1, which I assume would become 2.2?
[15:13:29] <kali> senior7515: stable is still 2.0.6, you're propably looking at a work in progress document
[15:19:28] <senior7515> kali: got you… hmmm i think I have a problem with the aggregation framework… it processes about 3/4 of the docs and then stops. I'm making the db operation with a write concern of SAFE… trying to track it down… any tips? I'm going to do a lastError check, see if that returns something
[15:20:59] <kali> senior7515: i have absolutely no experience with the aggregation fw. let's hope somebody else around here can help you.
[15:24:02] <senior7515> kali: got you. Thanks…. weird. I have a collection of a bunch of stuff, but it's a collection of one type, call it an enum. I do a count on that collection and I get .count() == 1200, but aggregating on that enum field returns 900. the 900 docs it processes are correct. just wondering why it stops. Last error returns nothing. :(
[15:24:59] <deedubs> ovaillancourt: newb question, can you change the output of MR so it's not {_id: ..., value: { mystuff: true }} and is more {_id: ..., myStuff: true}?
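Map/reduce output always nests results under value, and finalize can't remove the wrapper (whatever it returns is stored under value too). The usual trick is flattening the output collection afterwards; collection names here are hypothetical:

    db.mr_out.find().forEach(function (doc) {
      var flat = { _id: doc._id };
      for (var k in doc.value) { flat[k] = doc.value[k]; }
      db.mr_flat.insert(flat);
    });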
[16:57:40] <senior7515> ranman: I was about to update that thread indicating that the 'count' field in the $group object does add up to the total number of original documents, which is good.
[16:57:58] <senior7515> however it produces duplicate keys when I try to save the CommandResult (in java terms)
[16:58:08] <senior7515> duplicate keys exceptions, that is.
[17:01:06] <ranman> I'm not sure about that query sorry, I don't have the time atm to test that out, I'll post in your thread later if I do
[18:39:29] <souza> Guys, i'm having a problem installing the C driver for mongoDB. i've downloaded the zip file, unzipped it, and run the scons command, but if i try to compile a C file i get this > "fatal error: mongo.h: No such file or directory", meaning it's not finding the mongo.h file!
[19:01:48] <souza> ranman: i've tried with make, but when i run make test i get > "test/functions_test.c:71:15: error: lvalue required as left operand of assignment make: *** [test/functions_test] Error 1". the same error happens if i try scons test
[19:02:56] <ranman> souza: you ran make install as well? that test could just be failing right now, you can roll back to another tag
[19:11:02] <ranman> there are probably a few bugs since you're on master but just roll back to the last stable tag if something seems weird
[19:44:09] <dstorrs> I've got a collection, channel_videos_grouped. It stores data about the aggregated statistics of various YouTube publishers' videos by day. I would like to be able to run MR over it once a day and have the results appended. So, on day 1, "cnn" had stats X. on day 2, "cnn" still had those stats, and also had stats Y
[19:44:48] <dstorrs> I was keying based on channel name to make the map and reduce easy, but that means the username ends up in _id so stats always get overwritten
[19:45:06] <dstorrs> I could change the key but is there an easy way to do this?
[19:45:54] <dstorrs> something like "in the finalize function, replace the _id field with a new ObjectID()"
[19:46:20] <dstorrs> which I thought of, but I'm not sure how to do / if it's safe. thoughts?
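One way out is to make the day part of the emitted key, so each (channel, day) pair gets its own _id and nothing overwrites; out: {merge: ...} then preserves earlier days. A sketch, with hypothetical stat fields:

    var map = function () {
      // keying on channel AND day keeps day 2 from clobbering day 1
      emit({ channel: this.channel, day: this.day }, { views: this.views });
    };
    var reduce = function (key, values) {
      var out = { views: 0 };
      values.forEach(function (v) { out.views += v.views; });
      return out;
    };
    db.channel_videos_grouped.mapReduce(map, reduce,
      { out: { merge: "stats_by_channel_day" } });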
[20:00:12] <tonyk> in Node, why is var col = new mongodb.Collection(db, 'test'); slower than the callback version?
[20:00:45] <dstorrs> because callbacks are async and what you just wrote is not?
[20:01:21] <tonyk> well, selecting a collection is a "virtual" operation, isn't it?
[20:02:36] <dstorrs> you're going to have to excuse me. I'm hungry which makes me cranky. I can try to help, but you need to actually offer more information
[20:03:01] <dstorrs> and keep the cursing to yourself.
[20:04:11] <dstorrs> what is it that you're actually trying to accomplish?
[20:05:48] <tonyk> sorry that I offended your victorian sense of morality
[20:05:56] <tonyk> I'm trying to find out which is the best coding practice
[20:06:12] <tonyk> selecting a collection makes no sense for an async operation
[20:06:16] <wtmr> the best coding practice is to keep foul words to yourself
[20:07:35] <dstorrs> ...and I'm done here. I have no obligation to give free tech support to foul-mouthed little maggots who insult me after I offer an apology for a Poe's law violation.
[20:07:59] <tonyk> lol, keep in mind that I never called you names
[20:08:51] <dstorrs> ranman: "loi" I like that one. Much more accurate than "lol" or "rotfl", which people generally use when they are just kinda smiling quietly to themselves
[20:13:18] <dstorrs> wtmr: I'm not being an ass (well, ok, maybe, but not JUST being an ass for the sake of it). It really is what it means! http://dictionary.reference.com/browse/literal
[20:20:47] <kali> I couldn't find the HIMYM episode, so I had to fallback to xkcd :)
[20:21:57] <dstorrs> ObOnTopic -- in a m/r job, what is the best (safe / fast / mem-efficient) way to change the key from 'foo' to a new ObjectId() before final output?
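finalize can't do it: whatever it returns is stored under value, and _id stays the emitted key. Rewriting the output collection afterwards works, keeping the old key as an ordinary field (target collection hypothetical):

    db.mr_out.find().forEach(function (doc) {
      db.mr_archive.insert({ _id: new ObjectId(), key: doc._id, value: doc.value });
    });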
[20:29:11] <dstorrs> clowny: so you're in the shell and you type 'show collections' and you see something called 'db.*' alongside more normal things like 'foo' ?
[20:29:34] <dstorrs> if so, that's odd. I suspect it was created accidentally at some point
[20:30:02] <kali> you might be able to access it with db["db.*"] in mongo shell
[20:31:10] <dstorrs> clowny: have you tried to create any other collections?
[20:32:56] <clowny> dstorrs: yes, there is no problem actually, I just would like to know what it means, because it showed up out of nowhere
[20:36:36] <dstorrs> clowny: again, I think it was an accidental creation
[20:38:15] <clowny> dstorrs: if so, no problem, thank you for your help
[21:20:54] <geoffeg> i keep getting an exception "Invalid argument" when trying to connect to a mongod in PHP ($m = new Mongo("mongodb://localhost")) that i know is running and fine. any ideas?
[21:23:18] <linsys> geoffeg did you read the php documentation?
[21:24:37] <linsys> if not then reference this http://php.net/manual/en/mongo.tutorial.php
[21:25:01] <linsys> also make sure mongo is loaded in your php.ini
[21:25:22] <geoffeg> see the "replica sets" section of mongo.connecting.php: $m = new Mongo("mongodb://localhost:27017", array("replicaSet" => "myReplSetName"));
[21:25:53] <linsys> hmm never did that... ok then make sure mongo is loaded into your php.ini
[21:26:05] <linsys> If you are connecting to localhost just do new Mongo();
[21:26:48] <linsys> You have this in your php.ini extension=mongo.so
[21:28:28] <geoffeg> yea, and i can verify it's loaded via phpinfo()
[21:28:53] <linsys> not sure then.. I've never done mongodb:// I always do "new Mongo()"
[21:29:17] <geoffeg> yea, i tried it without the protocol (mongodb://) too
[21:29:36] <geoffeg> it works fine as a command-line php script
[22:01:39] <Kryten001> Hi, I just wrote a javascript function that does a find and puts the output in a new collection http://pastebin.com/zNBp7fuA
[22:01:53] <Kryten001> Is there a way I can load it in mongo and run it from C++ code ?
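One era-appropriate route is storing the function server-side in system.js and invoking it through eval, which the C++ driver can also call (via its eval method); the function and collection names below are hypothetical:

    // save once, from the shell
    db.system.js.save({ _id: "exportMatches", value: function (q) {
      db.source.find(q).forEach(function (d) { db.target.insert(d); });
    }});
    // later, from any client
    db.eval("exportMatches({ status: 'new' })");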
[22:20:58] <locojay1> hi, what is the best way to chunk a collection and hand the chunks to workers? skip/limit seems natural but my understanding is that skip needs to traverse the complete collection. i have about 3m docs...
[22:21:56] <dstorrs> locojay1: I'm in the process of doing this now, actually.
[22:22:35] <dstorrs> Given that you say "...to workers", I assume you're using a distributed system? (Gearman, RabbitMQ, etc)
[22:23:46] <dstorrs> if so, what I've got is to have the manager run a low-processing thing that rolls through and saves jobs to the DB
[22:24:12] <dstorrs> then it tells the clients "ok, go execute on this job". the clients go to the DB for their args, etc.
[22:25:02] <dstorrs> So, basically, I punted on the "chunking it up" part and instead handled it by "do really low-effort job on all docs first, then have workers handle queueing naturally"
[22:26:32] <locojay1> the manager in my case says: go and get docs in the collection, from/to. but skip() seems to be slow so i was looking for an alternative. it's not like a seek on a file
[22:26:43] <dstorrs> you could shard the collection and then map/reduce over it
[22:27:26] <dstorrs> what is it that you're trying to do? and what is your data?
[22:29:06] <locojay1> large collection with references to gridfs. each worker does some natural language processing and sends part of the coll to elasticsearch
[22:32:03] <locojay1> maybe doing a sort on _id or something else and doing a range query with limit is better...
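Right, that's the usual replacement for skip(): remember the last _id seen and range from there, which walks the _id index instead of re-reading everything skipped. A sketch:

    var lastId = null, batch;
    do {
      var q = (lastId === null) ? {} : { _id: { $gt: lastId } };
      batch = db.docs.find(q).sort({ _id: 1 }).limit(1000).toArray();
      if (batch.length > 0) {
        lastId = batch[batch.length - 1]._id;
        // hand this batch (or just its _id range) to a worker here
      }
    } while (batch.length > 0);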
[22:33:15] <dstorrs> How are you currently queuing the jobs?
[22:34:00] <locojay1> sharding seems like the best solution to avoid having a too-large collection
[22:34:12] <dstorrs> I wouldn't bother doing a sort.
[22:34:24] <dstorrs> and I would avoid limit, I think.
[22:34:39] <dstorrs> doesn't feel like the right solution
[22:35:06] <dstorrs> hang on. you have lots of workers on various machines, right?
[22:35:48] <dstorrs> why do you need to do the chunking from the manager? get a cursor to the collection and just throw jobs in the queue. it's zmq's job to get them distributed -- which is, by definition, chunking.
[22:36:46] <dstorrs> how you pass those jobs over is up to you -- as a protocol request / string over the wire / separate collection / whatever -- but I think it's your best option
[22:37:49] <compufreak> How do you search for a string inside a field? ex. find any objects that have "string" in the "title" field?
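Substring search is done with a regex query; only a case-sensitive anchored prefix can use an index, so unanchored patterns scan everything (collection name hypothetical):

    db.articles.find({ title: /string/ });    // substring match, full scan
    db.articles.find({ title: /^string/ });   // prefix match, can use an index on title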
[22:37:55] <locojay1> yes sending the data to the workers via a queue is cleaner than having each worker connect to mongodb and get the data since skip is slow.....
[22:39:10] <dstorrs> no, I mean, have the manager queue up jobs with args like this: { name => 'nlp_user', user => 'bob' }
[22:39:44] <dstorrs> the worker gets the queue request and says "ahah. when I get an 'nlp_user' command, I know to go to the DB and pull data for 'bob' and do my stuff on it"
[22:43:28] <dstorrs> All you care is the work gets done in time and accurately. Let zmq figure it out.
[22:47:37] <LesTR> hi, can someone help me with a replicaSet problem please? We have a "large" replicaSet: 3 servers, each with 16GB ram, ~200GB of data, index size ~8GB. Now we have a problem with slave delay, but only on one
[22:48:12] <dstorrs> LesTR: I don't know that I can help, but I'll try. But I need to see actual error messages / output, etc.
[22:48:36] <LesTR> the 2 OK servers are absolutely identical and the one problematic slave has +2 CPUs (with TH)
[22:48:52] <LesTR> i don't have ssh access right now :(
[22:50:29] <dstorrs> but without some sort of specifics I can't really help
[22:51:01] <LesTR> me too, but it's the only difference
[22:51:35] <dstorrs> here's the only thing I can offer -- http://docs.mongodb.org/manual/replication/ It's the most recent docs. sorry it's not more, and good luck.
[22:52:34] <LesTR> can u please give me a hint for checking the logs?
[22:52:34] <dstorrs> btw, you do understand that "eventually consistent" means "there may occasionally be small slave delays", right?
[22:53:05] <dstorrs> I'm assuming that you're actually seeing a significant ongoing issue.
[22:55:02] <dstorrs> LesTR: nothing comes to mind. I haven't done Mongo replication myself (hence the up-front "I don't know that I can help, but I'll try")
[22:55:47] <dstorrs> I will say this -- I can answer almost every question that comes into this channel, and that's based only on extensive reading. There's a lot of good info / docs out there.
[22:56:14] <dstorrs> just make sure what you're reading is about a version equivalent to yours
[22:59:16] <LesTR> sorry, my ssh connection on server with irssi died
[22:59:34] <LesTR> i have basic info about replication
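For what it's worth, the standard first checks for slave delay are the shell's replication helpers:

    rs.status();                       // per-member state and optime; compare optimes to see the lag
    db.printSlaveReplicationInfo();    // prints each slave's seconds behind the primary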
[23:02:38] <rdegges> Yo! I'm using pymongo, and what I'd like to do is essentially an insert_or_get type operation: I'd only like to insert my new document if it doesn't exist in the database already.
[23:02:52] <rdegges> Is there a way to do that? I can't find any information in the pymongo documentation.
[23:03:31] <clowny> hello guys, does anybody know if mongodb preferentially keeps the indexes in memory or just does LRU?
[23:04:02] <dstorrs> rdegges: insert_or_get ... that doesn't make sense to me. insert is a modifying op when you have data, get() is a retrieval op when you have partial data
[23:04:37] <rdegges> dstorrs: well, I'd like to only insert my data if it doesn't already exist.
[23:04:54] <rdegges> So instead of doing like, if blah.find_one({}): blah.insert({})
[23:05:02] <rdegges> I'd like to do it in a single operation.
[23:05:08] <dstorrs> ok...why not just do the insert() ?
[23:05:16] <rdegges> Oh, I should probably explain that, d'oh.
[23:05:22] <dstorrs> inserting the data twice is idempotent
[23:05:27] <rdegges> I'm looking specifically for a key.
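A unique index on that key makes the insert itself the test: a second insert of the same key fails instead of duplicating, with no find-then-insert race. Shell syntax below; in pymongo the equivalent is insert(..., safe=True) with pymongo.errors.DuplicateKeyError caught:

    db.things.ensureIndex({ key: 1 }, { unique: true });
    db.things.insert({ key: "abc123", payload: "..." });   // first time: ok
    db.things.insert({ key: "abc123", payload: "..." });   // again: E11000 duplicate key error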
[23:09:14] <dstorrs> You are in a maze of twisty little race conditions, all horrible.
[23:10:08] <dstorrs> I think your solution is going to have to involve refactoring your data model, not your application logic. but that's just a guess at this point
[23:10:58] <werkshy> hi guys. i'm having a problem retrieving only certain nested fields in pymongo
[23:11:09] <werkshy> db.logEventByHour.find({}, {"logs.hour":1}) works in the mongo shell
[23:11:27] <werkshy> but passing fields = {"period" : 1, "logs.hour": 1} doesn't work in pymongo