[01:43:52] <Nerp> do you guys happen to know the projected final release of 2.2.1?
[03:18:18] <zivester> Hi, I have a "Product" that has many "Attributes"... what's the correct way to search for products that have a specific attribute, let's say keyed by "name"? I currently only have products referencing attributes
[03:44:30] <zivester> err.. I'm trying to find all products that have an attribute 'shoes' ... my Product has: attributes : [ { type : Schema.Types.ObjectId, ref: 'Attribute' } ]... I'm guessing this has to be a many to many relationship and I have to store Products on attributes as well?
[03:53:48] <zivester> just reading about embedding data, or linking (which I assume is what I'm doing in that version)... I'm guessing if I want to search by attribute.. I either need to embed all attributes directly into the product... or reference all products from attributes and perform the find() on those instead
[03:55:37] <IAD> zivester: first you can find the _id in db.attributes: $a = db.attributes.findOne({'name' : 'shoes'}) . and second, $products = db.products.find({'attributes' : $a['_id']})
[03:56:12] <IAD> or yes, you can store the attributes inside of the products. that's also a good way
[03:59:21] <IAD> it will be difficult to change an attribute name (in two collections) but it's faster =)
[04:03:47] <zivester> thanks IAD.. now trying to figure out why my successive saves are creating new attributes even if I supply the same name :)
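IAD's two-step "linking" lookup, sketched in Python against an in-memory stand-in for the two collections (collection and field names follow the discussion; with a real database the two steps would be findOne()/find() calls):

```python
# In-memory illustration of the two-step lookup: resolve the attribute's
# _id by name, then find products whose 'attributes' array references it.
attributes = [
    {"_id": 1, "name": "shoes"},
    {"_id": 2, "name": "hats"},
]
products = [
    {"_id": 10, "title": "sneaker", "attributes": [1]},
    {"_id": 11, "title": "beanie", "attributes": [2]},
    {"_id": 12, "title": "boot", "attributes": [1, 2]},
]

def find_products_by_attribute(name):
    # step 1: db.attributes.findOne({'name': name})
    attr = next(a for a in attributes if a["name"] == name)
    # step 2: db.products.find({'attributes': attr['_id']}) --
    # matching a scalar against an array field matches any element
    return [p for p in products if attr["_id"] in p["attributes"]]
```

Note there is no need for a many-to-many back-reference just to query in this direction; the array of ObjectIds on the product side is queryable as-is.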
[06:22:38] <andoriyu> is it possible to put $limit inside of $group?
[08:34:26] <shingara> someone asked me if there are any plans for a major version
[08:34:40] <shingara> I'm just checking whether it's planned or not, not saying I want it :)
[08:36:22] <kali> shingara: i must say it's hard from a technical / feature / operations pov to tell the difference between a minor and a major version of mongo
[08:36:31] <kali> shingara: 2.2 brought more stuff than 2.0 for instance
[08:36:44] <kali> shingara: so i honestly think this is mostly communication stuff
[08:38:01] <kali> shingara: same, in the 1.x era, there was a version bump (1.2 ? 1.4 ?) that required a data migration... something you'd usually expect for a major version only
[08:38:58] <shingara> kali: agree with you, but my answer isn't for a technical person, so only the number counts :)
[09:59:34] <Gargoyle_> If I am iterating over a cursor and need to update every document, is there a recommended way?
[10:00:34] <NodeX> don't make a new connection each time
[10:00:40] <vnico> hi there, how can I increase the RAM usage in MapReduce operations to reduce hard disk usage?
[10:00:51] <Gargoyle_> I've noticed in the past that changing the data while looping over the cursor will cause it to do more loops than there are items! Is this OK, or should I be doing something else, like writing somewhere else?
[10:01:44] <Gargoyle_> NodeX: Not worried about speed.
[10:02:07] <NodeX> then a simple $set update on each iteration will be fine
[10:02:32] <Gargoyle_> Just that say I have 10,000 documents, and I loop over the cursor with a foreach and modify and save() them, then the loop will often run well over 10,000 times.
[10:03:08] <NodeX> I don't see how it can, if it's fed 10k items it can only foreach() 10k times
[10:03:59] <NodeX> I would say that's a code problem then because I have never had that happen to me lol
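A possible explanation for the extra loops, offered as an assumption rather than anything confirmed in-channel: in MMAP-era MongoDB, a save() that grows a document can relocate it on disk, and a plain cursor may then return that document a second time (cursor.snapshot() was the usual remedy). A pattern that sidesteps the problem entirely is to read the _ids up front and update by _id, sketched here against an in-memory stand-in for the collection:

```python
# Sketch of an update-every-document pass that cannot double-count:
# snapshot the _ids first, then modify each document by _id. With pymongo
# the two phases would be find({}, {'_id': 1}) followed by update() calls.
collection = [
    {"_id": i, "visits": i * 2, "flagged": False} for i in range(5)
]

def update_all(docs):
    ids = [d["_id"] for d in docs]          # snapshot the ids up front
    processed = 0
    for _id in ids:
        doc = next(d for d in docs if d["_id"] == _id)
        doc["flagged"] = doc["visits"] > 4  # the per-document modification
        processed += 1
    return processed
```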
[10:08:08] <idank> when doing a find({..}).count(), does it matter if I filter out some columns or is mongo smart enough to see there's a count() there?
[10:08:41] <ron> I'm fairly certain it's sequential.
[10:09:15] <idank> so if my documents are big I should filter them to only fetch _id?
[10:09:43] <ron> I'm not entirely sure, but you can run a simple benchmark to test it out.
[10:11:23] <idank> my collection is ~650gb in size, with 700k rows where one of the fields is an array
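For what it's worth, and hedged since nobody verified it in-channel: drivers generally translate find(query).count() into a server-side count command that carries only the query, so no documents (and no fields) are shipped to the client and the projection shouldn't matter. Roughly the command document involved:

```python
# Approximate shape of the command a driver sends for find(query).count():
# only the collection name and the query travel over the wire, so counting
# is unaffected by how large the matched documents are.
def count_command(collection, query):
    return {"count": collection, "query": query}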
[11:32:09] <Gargoyle> A basic update loop:- https://gist.github.com/3945549
[11:32:52] <Gargoyle> Don't worry about Place_handler, it just allows static config of the db connection details.
[11:33:15] <Gargoyle> Output from that function:- "Processed 568 of 516 places."
[11:34:30] <NodeX> and the count is definitely right?
[11:35:06] <Gargoyle> yup. The collection only has 516 items.
[11:35:29] <NodeX> 1 second let me build a quick test
[11:35:40] <Gargoyle> If I omit only the line that calls save(), then the loop runs 516 times.
[11:36:15] <Gargoyle> I've built a lot of these scripts over the last few months - and while it's a little unnerving, they all seem to produce the expected results.
[11:36:21] <NodeX> is there a difference if you call $unset instead of save?
[11:37:39] <NodeX> All I can think is that either an APC cache (if you're running one) is messing up $count or the variable is being overwritten globally
[11:37:52] <Gargoyle> In this case, yes. It only loops the expected number of times. But IIRC previous scripts that have used $set showed the same behaviour (I may have used save() calls though - can't quite remember)
[11:38:25] <Gargoyle> I don't think I have APC enabled on this version.
[11:39:35] <Gargoyle> I have come across PHP "funny business" when updating an array that you are looping over - but the cursor isn't actually an array is it?
[11:44:36] <Gargoyle> NodeX: Running the script again (With no more entries requiring the unset) does not do any extra loops.
[11:49:12] <Gargoyle> I have a bit more to add to my script, and then I need to remove the flags subdoc. This time I am going to output the count and the id being saved for each iteration
[11:49:56] <NodeX> I also get the expected counts for an upsert
[11:50:06] <NodeX> let me add save() to my wrapper a minute and test
[11:52:13] <NodeX> I get the same expected results with save();
[11:53:06] <NodeX> https://gist.github.com/3945669 <--- yields 10k updates and 10k iterations
[11:54:33] <Gargoyle> now I am only getting the expected number of loops!
[13:12:20] <idank> NodeX: any idea why that query is getting killed?
[13:21:08] <sirious> if i have documents which contain a key which is a list of dictionaries, how do i query for the existence of a specific dictionary in that list based on multiple k/v pairs in that one dictionary?
[13:22:31] <sirious> basically want to do db.a.find({'a': {'a': 'foo', 'b': 'bar'}})
[13:24:40] <kali> sirious: you're looking for $elemMatch
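kali's pointer, illustrated (field names follow sirious's example; these are the filter documents a driver would send): {'a': {'a': 'foo', 'b': 'bar'}} demands an array element exactly equal to that whole subdocument, field order included, while $elemMatch asks for a single element satisfying all the criteria at once:

```python
# Two ways of querying an array of subdocuments, as filter documents.
def exact_element_filter(field, sub):
    # matches only if some element equals `sub` exactly (same keys, same
    # order, no extras) -- usually not what you want
    return {field: sub}

def elem_match_filter(field, criteria):
    # matches if one single element satisfies all the criteria, in any
    # order, even when that element carries extra keys
    return {field: {"$elemMatch": criteria}}
```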
[13:39:46] <kali> idank: often you have to touch the data when you want a new way to query it
[13:40:15] <idank> perhaps, but this is a one time query to gather stats on my data
[13:40:27] <idank> I gave up on running it on all of it
[13:40:37] <idank> and now even running it on a sample seems hard
[13:57:46] <Freso> Is there a comparison somewhere of MongoAlchemy vs. MongoKit vs. "raw" pyMongo (vs. other drivers?)? I'm just starting out with Mongo, so just trying to find my way around. :)
[15:17:44] <rio{> which is fine since I don't have any data at this stage. But is there a better way to do this if I found myself in that position again? Of course you shouldn't get there, but still.
[15:22:39] <NodeX> for a unique index you have to make sure it's unique
[15:50:58] <solussd> is it possible to, without actually updating in the database, see the result of an update (so I can validate the document) and then commit the update?
[15:51:48] <MongoDBIdiot> no transactions, no commits
[16:27:39] <remonvv> so even if you do w = "majority" or w > 1 you have to consider the reads eventually consistent because some members might not have the data
[16:28:02] <remonvv> unless you always read from the primary only
[16:28:13] <NodeX> remonvv : you work with video right?
[16:28:14] <remonvv> You don't really have any other mode kind sir.
[16:29:01] <astropirate> how is mongo's map reduce performance compared to others? my primary use of the database is: write some data, then run a whole lot of map reduces very often
[16:29:01] <remonvv> NodeX, not really. We provide second screen services for TV shows and films.
[16:29:48] <astropirate> Hadoop is an option but it seems to require multiple nodes
[16:29:54] <NodeX> aggregation is faster - if your app can do it in aggregation
[16:29:55] <remonvv> NodeX, software that provides additional info/interaction during live events. E.g. you're watching sports and real-time data or comments are shown on your iPad or you can play along with Weekend Millionaires from home, etc.
[16:30:11] <NodeX> ah ok, so you dont use rtmp for that then?
[16:30:27] <remonvv> No we have in-house tech for most of it.
[16:30:41] <kali> astropirate: hadoop can run standalone
[16:30:57] <astropirate> kali, have you had any dealings with riak?
[16:31:04] <astropirate> thats my second choice after mongo
[16:31:08] <remonvv> In order to keep it cost efficient we need to push content to a lot of clients effectively and RTMP wasn't a very clean fit at the time.
[16:31:20] <remonvv> astropirate, never do m/r on databases directly.
[16:31:57] <kali> astropirate: well, m/r ops are full table scans, so you need a db oriented for bandwidth. dbs like riak and mongodb (and sql) are optimized for latency
[16:57:59] <mnaumann> i'm using mongodump for the first time (mongodb runs with --auth), trying to backup multiple databases where i have created the same non-admin user with the same password.
[16:58:23] <mnaumann> i'm using the --username and --password options with mongodump and it says that authentication fails.
[16:59:32] <mnaumann> i'm passing multiple databases to backup as --db="first_database second_database"
[16:59:51] <mnaumann> is something about this wrong? do i need to use an admin user to backup?
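One likely culprit, offered as an assumption since it goes unanswered above: mongodump's --db option takes a single database name, so --db="first_database second_database" authenticates against a database literally named with that whole string, where the user doesn't exist. The usual shape is one invocation per database; sketched here as the argv lists you would hand to a process runner:

```python
# Build one mongodump invocation per database instead of cramming several
# names into a single --db value.
def mongodump_commands(databases, username, password, out_dir="dump"):
    return [
        [
            "mongodump",
            "--db", db,               # one database per invocation
            "--username", username,
            "--password", password,
            "--out", out_dir,
        ]
        for db in databases
    ]
```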
[17:01:05] <jedir0x> I have an entity collection, and a collection of comments that are associated with those entities. given a collection of entity ids, how might i pull the latest 3 comments (dateCreated) for each of the ids? This needs to be a realtime query, i can't use MapReduce.
[17:06:01] <Dennis-> use the aggregation framework
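Dennis-'s suggestion sketched for jedir0x's "latest 3 comments per entity" question (field names are illustrative, and hedged: the 2.2-era pipeline has no per-group limit operator, so after $sort and $group/$push the trim to 3 has to happen client-side):

```python
# Aggregation pipeline: newest comments first, then one group per entity
# with all its comments pushed in sorted order.
def latest_comments_pipeline(entity_ids):
    return [
        {"$match": {"entityId": {"$in": entity_ids}}},
        {"$sort": {"dateCreated": -1}},                  # newest first
        {"$group": {
            "_id": "$entityId",
            "comments": {"$push": {"text": "$text",
                                   "dateCreated": "$dateCreated"}},
        }},
    ]

def trim_to_latest_three(grouped):
    # client-side: $push kept every comment, so cut each list to 3
    return {g["_id"]: g["comments"][:3] for g in grouped}
```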
[17:30:26] <psychowico> I guess some of you have done projects on it. I'm thinking about using Doctrine ODM (I'm writing in php)
[17:31:22] <psychowico> but I just read that working with an ODM is "not schemaless". what do you think about this? do some of you use doctrine odm or some other ODM in php?
[17:32:30] <NodeX> avoid things that fix your schema
[17:35:15] <IAD> psychowico: my ODM uses __set and __get to stay schemaless
[17:35:37] <NodeX> also avoid unnecessary bloat in your app
[17:41:32] <psychowico> I have a noob question about nosql too. I had standard relational sql experience before and I'm thinking about a scenario:
[17:42:40] <psychowico> mhhh.. or maybe not. I just realized the answer by myself :P sorry, sometimes it happens to me :P
[17:44:17] <jedir0x> at some point i'll add aggregation support to MJORM
[17:44:47] <jedir0x> and update the documentation, it supports any type for ids now, as well as polymorphic mapping.
[17:45:04] <jedir0x> aggregation via MQL would be nice probably too
[18:07:34] <patrickod> I have a large TTL collection on which I need to change the timeout
[18:07:46] <patrickod> is the only way to do this to delete the first index and then re-index?
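Hedged answer, since none is given in-channel: from 2.2 the collMod command can change a TTL index's expireAfterSeconds in place, without dropping and rebuilding the index. Roughly the command document a driver would send, with illustrative collection and field names:

```python
# e.g. db.runCommand({collMod: 'sessions',
#                     index: {keyPattern: {createdAt: 1},
#                             expireAfterSeconds: 3600}})
def ttl_collmod_command(collection, key_pattern, seconds):
    return {
        "collMod": collection,
        "index": {"keyPattern": key_pattern,
                  "expireAfterSeconds": seconds},
    }
```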
[18:10:34] <jedir0x> ok, with the aggregate framework how would i limit the number of "commentIds" returned on the document to 3: http://pastie.org/5110428 ?
[18:37:31] <psychowico> maybe someone of you can show me your lightweight wrapper for php \Mongo using, if u have any opensource project that have something like this?
[18:49:58] <sirious> anyone else have issues doing a $pull with $elemMatch which contains an ISODate value?
[18:51:18] <Freso> psychowico: Why not try the default/standard PHP driver?
[18:51:32] <Freso> psychowico: Who knows? It might just turn out to be all you needed.
[18:52:30] <Freso> psychowico: And if you, after having tried it out, realise you need more than that, you'll most likely know *what* it is you need more, and you'll be able to pick more selectively among the wrappers out there - or make your own.
[19:24:13] <__ls> if i call find() on a collection, does the resulting cursor always return results in the same order?
[19:25:35] <Dennis-> if you depend on ordering, then use sort()
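Dennis-'s point, illustrated with made-up data: without sort(), result order is whatever the server finds convenient and can change between runs; an explicit sort spec, with _id as a tie-breaker, is deterministic. The Python equivalent of find().sort([('score', -1), ('_id', 1)]):

```python
docs = [
    {"_id": 2, "score": 7},
    {"_id": 1, "score": 7},
    {"_id": 3, "score": 5},
]

def sorted_like_mongo(docs):
    # score descending, then _id ascending as a tie-breaker
    return sorted(docs, key=lambda d: (-d["score"], d["_id"]))
```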
[19:30:10] <Dennis-> are you a serious developer or a tinkerer?
[19:31:11] <yawning> i did a hard reset without shutting down properly and now my db is missing all of its collections. i just see system.indexes and session_table
[19:31:23] <yawning> I do have a backup i could pull down. is there anything else I could do instead?
[21:12:34] <sinisa> is the aggregation framework blocking, anyone?
[21:24:56] <kian> i'm setting up a sharded cluster; is it possible to, in addition, have a node which is never queried but contains the entire data set? (for simplifying backups, for example)
[21:39:21] <camonz_> is there a way to update each record with a value from within that record in a mass update?
[21:42:51] <camonz_> something like referencing a value of an embedded field in the iterated object when specifying the value to be updated
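Hedged, since camonz_'s question goes unanswered above: in this era an update document cannot reference the document's own fields, so the usual workaround is a find()-then-update() loop that computes the new value per record. An in-memory sketch with made-up field names:

```python
records = [
    {"_id": 1, "price": 10, "meta": {"discount": 2}},
    {"_id": 2, "price": 20, "meta": {"discount": 5}},
]

def apply_embedded_discount(docs):
    # per record: derive 'final' from the record's own embedded field,
    # the way a cursor loop with per-_id update() calls would
    for doc in docs:
        doc["final"] = doc["price"] - doc["meta"]["discount"]
    return docs
```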
[21:54:20] <jgornick> Hey guys, looking at this gist https://gist.github.com/61b7c0266e88347d8968, why doesn't the $pushAll on the AtomicIssue cause a unique index violation if a task with the same title already exists?
[21:54:51] <jgornick> I'm assuming a pushAll would also cause unique index violations?
[23:13:04] <jgornick> Hey guys, is there a reason why $pushAll does not trigger a unique index violation?
[23:13:04] <Gargoyle_> I seem to be failing to connect to a replica set with php 5.4 and driver 1.3.0beta2. I'm getting "not master and slaveOk=false".
[23:13:56] <Gargoyle_> What's the correct way to specify slaveOk in the new driver (as setSlaveOk() is dep'd)
[23:16:16] <Gargoyle_> Bit confused, as our live servers running php 5.3 and 1.3.0beta2 are working. Is this something I have overlooked in my new php-fpm config? Any hints would be helpful.
[23:29:50] <jgornick> Am I wrong in saying that the update command in 00-mongo.js should cause a unique index violation? https://gist.github.com/61b7c0266e88347d8968
[23:35:00] <jgornick> Well, this sucks: https://jira.mongodb.org/browse/SERVER-1068