PMXBOT Log file Viewer


#mongodb logs for Tuesday the 9th of July, 2013

[02:57:44] <pasky_> Hi! Is there an easy way to make a newly inserted object have an _id that is a plain string instead of ObjectId? (Unfortunately, having it ObjectId seems to produce endless pain in conjunction with Meteor.)
[02:57:56] <pasky_> (I'm using pymongo to insert objects.)
[03:05:40] <pasky_> ok, adding '_id': str(bson.objectid.ObjectId()) to the inserted object seems to do the trick \o/
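pasky_'s trick works because a BSON ObjectId is 12 bytes (a 4-byte Unix timestamp followed by machine/process/counter bytes), so `str(ObjectId())` is 24 hex characters. A driver-free sketch of an id with the same shape, for when only a unique, roughly time-sortable string key is needed (the random tail here is an illustrative stand-in, not the real ObjectId layout):

```python
import os
import time

def string_object_id() -> str:
    """Build a 24-char hex id shaped like str(ObjectId()):
    4 big-endian timestamp bytes plus 8 pseudo-random bytes."""
    ts = int(time.time()).to_bytes(4, "big")
    return (ts + os.urandom(8)).hex()

# The document then carries a plain-string _id, avoiding ObjectId
# round-tripping issues with tools like Meteor:
doc = {"_id": string_object_id(), "name": "example"}
```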
[03:34:49] <sdsheeks> evening
[04:09:52] <sag> hi all
[04:10:16] <sag> can anyone tell me which is the ideal method to integrate mongo with php
[04:10:27] <sag> i mean php-mongo driver or mongo REST api
[04:11:00] <sag> i am using php-mongo driver in production from Oct-2012
[04:11:41] <sag> it takes time when i am firing say 500 commands in a for loop
[04:12:13] <sag> using multi_curl i can fire some parallel HTTP calls to mongo REST
[04:30:12] <sag> anybody
[04:31:22] <sag> can anyone tell me which is the ideal method to integrate mongo with php
[04:31:37] <sag> i mean php-mongo driver or mongo REST api
[07:37:27] <[AD]Turbo> hola
[07:53:12] <TraceRoute> hey all, I'm looking at using mongodb to add redundancy to my infrastructure. Could someone confirm this is possible. I have servers at different locations and a parent mongodb which acts as the master. The servers at different locations should only have data which relates to their location. Is there some way I can use replication to only pull down records that relate to the location and also push local changes back to the master?
[07:56:36] <sag> i think that u have to manage in schema
[07:56:48] <sag> like geo specific collections..
[07:57:17] <sag> i mean u have to define collections in that way..
[07:59:29] <TraceRoute> ideally thats how the root schema would start…a location has a bunch of objects inside. I only want the one server to have replication of 1 root object
[07:59:32] <sag> i dont think this can be possible with replica sets
[08:00:01] <sag> u may have to do master slave
[08:00:14] <sag> no sorry..
[08:00:30] <sag> not even that..u will have to use shards..
[08:00:44] <sag> sharding on geo tag
[08:01:20] <sag> all app should pass their country's geo tag..
[08:01:50] <shmoon> hi
[08:02:01] <sag> hi
[08:02:17] <shmoon> i do db.coll.find()... and also db.coll.find({..}) , assign it to a variable, i need to access _id now in the mongo console but i am failing
[08:02:36] <shmoon> basically i want to get the date and time for each collection from find() off their _id (since i think _id has timestamp)
[08:02:43] <TraceRoute> thx sag, I'll look into it more
[08:03:33] <sag> @TraceRoute : why u need primary mongo..as u already have some store..
[08:03:50] <sag> @TraceRoute and u just want redundancy
[08:04:10] <shmoon> ObjectId("507c7f79bcf86cd7994f6c0e").getTimestamp()
[08:04:13] <shmoon> oops sorry for that
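What shmoon pasted is the shell answer: the leading 4 bytes of any ObjectId encode its creation time, which is exactly what `getTimestamp()` reads. The same decode in pure Python, as a sketch (no driver needed, since the hex string alone carries the timestamp):

```python
from datetime import datetime, timezone

def objectid_timestamp(oid_hex: str) -> datetime:
    """Decode the 4-byte big-endian Unix timestamp that leads
    every ObjectId (what the shell's getTimestamp() returns)."""
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# The id from the conversation decodes to an October 2012 creation time:
created = objectid_timestamp("507c7f79bcf86cd7994f6c0e")
```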
[08:04:16] <Derick> sag: it's "you" and it's also customary on IRC to refer to people with "TraceRoute:" and not "@TraceRoute"
[08:05:38] <sag> "Derick:" ok sir..i am new to the channel, didn't know the convention..will keep that in mind
[08:06:14] <Zelest> I've actually thought about that Derick ..
[08:06:27] <Zelest> making a IRC-gui kind of.. sort of a client.. but with a somewhat twitter-interface..
[08:06:37] <Zelest> like, all channels and such in the same window
[08:06:39] <Derick> Zelest: gah, people can just learn things.
[08:06:50] <Zelest> but you use hashtags (channels) to communicate :D
[08:06:56] <Zelest> or.. the other way around..
[08:07:01] <Zelest> a twitter-interface like a IRC client :D
[08:08:19] <Zelest> Derick, so, what's popping.. anything new I've missed? :D
[08:08:46] <Derick> Zelest: dunno, I've been on holiday
[08:09:21] <Zelest> same, hence the question :P
[08:12:50] <shmoon> anyone?
[08:13:24] <sag> shmoon: u need all ids from all coll? or all ids from some coll?
[08:14:45] <shmoon> sag: lets do 1 by 1. lets say i do this db.coll.find({table_id:5}).limit() now how do I get the _id from it and maybe store in a variable in the console?
[08:18:33] <sag> shmoon: fond always returns a document
[08:18:38] <sag> *find
[08:19:24] <shmoon> look at this sag http://pastie.org/8123793
[08:19:26] <shmoon> nothing
[08:21:20] <sag> shmoon: so u will always get whole document unless specified as, db.coll.find({table_id:5},{title:1}) , will only return id and title..
[08:21:40] <shmoon> hm
[08:21:42] <shmoon> it returns cursor
[08:21:50] <shmoon> so i have to do doc.next() it seems
[08:23:31] <sag> shmoon: check ur pastie, i have replied there..let me know if it works
[08:23:50] <sag> i mean http://pastie.org/8123793
[08:59:31] <shmoon> sag: you cannot reply on pasties i think
[08:59:40] <shmoon> no commenting system, no edit on the same url, nothing
[08:59:45] <shmoon> it contains what i wrote
[08:59:49] <shmoon> or maybe i am missing something
[09:14:13] <untaken> Anyone familiar with the CPAN modules for MongoDB? What is the most efficient way to setup paging? When I check the cursor after with has_next, it seems to have pulled all the rows from the collection. There must be a more efficient way, when I set limit and skip? Maybe it's just the perl module, but was hoping someone may know around here?
[09:15:36] <Derick> untaken: that doesn't seem related to just the perl driver
[09:17:12] <untaken> Derick: I know, but wasn't sure where to look next... really was after some pointers :)
[09:17:23] <untaken> bit of channel attack there :/
[09:17:36] <Derick> seems odn is a bit flakey at the moment
[09:17:49] <untaken> odn?
[09:17:54] <Derick> freenode
[09:17:57] <untaken> ah yea
[09:17:58] <Derick> this IRC network
[09:20:00] <[AD]Turbo> Sorry, I must post my question again (split sucks)
[09:20:03] <[AD]Turbo> I have a collection (Items) with a 2dsphere index on a 'pos' field (db.Items.getIndexes() returns me "key" : { "pos" : "2dsphere" }, "name" : "pos_2dsphere") but when I query that collection db.Items.find({ 'pos': { $geoWithin: { $box: [ [0, 0], [50, 50] ] } } }).explain() I see that a "BasicCursor" is chosen instead of the index. Is there a reason?
[09:20:42] <Derick> [AD]Turbo: geowithin with box wants a 2d index
[09:21:07] <Derick> "$box" is a flatland feature, not a spherical one
[09:21:16] <[AD]Turbo> ah, but I have need to query that collection with $nearSphere too
[09:21:45] <Derick> [AD]Turbo: try geowithin + geojson where you construct the polygon yourself
[09:22:34] <[AD]Turbo> does the "2d" index support $nearSphere and geoWithin+box at the same time?
[09:22:55] <[AD]Turbo> if so, i can use a 2d (not a 2dspere)
[09:23:19] <perplexa> http://pastebin.com/2rrzaXrR < can anybody please explain why i get this error? :(
[09:23:39] <Derick> [AD]Turbo: no
[09:23:46] <Derick> [AD]Turbo: use a 2dsphere, it's much faster too
[09:24:10] <Derick> perplexa: which MongoDB version do you use? You need atleast 2.2
[09:24:21] <perplexa> 2.5
[09:24:27] <Derick> uh?
[09:24:35] <perplexa> MongoDB shell version: 2.5.1-pre-
[09:24:39] <perplexa> server is 2.4
[09:24:50] <Derick> that is odd
[09:24:56] <perplexa> yeah ;/
[09:25:28] <Derick> i run 2.4.3 shell with 2.5.1-pre-
[09:25:37] <Derick> > db.auctions.aggregate( { $match : { 'bId': 893634 } } );
[09:25:37] <Derick> { "result" : [ ], "ok" : 1 }
[09:25:41] <Derick> works fine...
[09:25:48] <Derick> what does db.version() output?
[09:26:04] <perplexa> oh!
[09:26:04] <perplexa> > db.version()
[09:26:05] <perplexa> 2.0.4
[09:26:09] <perplexa> brb, slapping somebody
[09:26:11] <perplexa> :)
[09:26:13] <Derick> :D
[09:36:04] <shmoon> i have a question
[09:36:24] <Nodex> amazing
[09:38:26] <shmoon> http://pastie.org/8123977 - I need the k12 value for table_rows.reader = 'total no of children', how do i go about doing it
[09:39:43] <Nodex> in one document?
[09:39:55] <shmoon> ya this is 1 docment
[09:40:06] <shmoon> hm yeah get it as 1 document that would help
[09:40:32] <Nodex> loop through the array in your script, there is no mongo specific way to do this other than a map/reduce
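The pastie behind shmoon's question is gone, so the field names below are guesses at its shape; Nodex's point is simply that you fetch the document and scan the embedded array in application code. A sketch, assuming a `table_rows` array of `{reader, k12}` subdocuments:

```python
def find_row_value(doc, reader_label, field):
    """Scan an embedded array client-side: return `field` from the
    first table_rows entry whose 'reader' matches reader_label."""
    for row in doc.get("table_rows", []):
        if row.get("reader") == reader_label:
            return row.get(field)
    return None

doc = {  # hypothetical shape of shmoon's document
    "table_rows": [
        {"reader": "total no of children", "k12": "42"},
        {"reader": "other row", "k12": "7"},
    ]
}
value = find_row_value(doc, "total no of children", "k12")
```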
[09:40:47] <shmoon> :(
[09:40:59] <Nodex> why are your integers cast as a string
[09:41:00] <Nodex> ?
[09:41:36] <shmoon> this data is coming from an html table with fields, i am not sure how i can cast proper int as int and rest as string
[09:41:40] <shmoon> i will show you an image of the UI wait
[09:42:21] <shmoon> Nodex: http://puu.sh/3yxXi.png
[09:42:23] <Nodex> you're telling me you blindly put data in your database without sanitising it?
[09:42:44] <shmoon> help me understand how to sanitize this then
[09:42:53] <Nodex> that's outside the scope of this channel
[09:43:08] <shmoon> i think just saving it as it is and htmlentities while printing should be good enough
[09:44:14] <Nodex> ok, good luck :)
[09:45:04] <shmoon> please tell me if that has a problem?
[09:45:28] <Nodex> again, outside the scope of this channel
[09:46:42] <shmoon> :(
[09:49:07] <Nodex> application / database security and XSS / CSRF has nothing to do with MongoDB
[09:52:54] <Nodex> :P
[09:54:13] <moogway> hi, i am trying to use mongodb with python and wondering about ORMs... I prefer to use pymongo over anything else but it doesn't support declaring the model and I'm not sure how to tell mongo to create/ensure an index on a collection using my app. Do I have to create the db and collections manually using mongo shell?
[09:54:32] <shmoon> dude
[09:54:38] <shmoon> there must be some way to achieve what i want to
[09:54:40] <shmoon> rather than looping
[09:54:42] <shmoon> in app layer
[09:54:50] <shmoon> some easy query
[09:55:18] <Nodex> there
[09:55:21] <Nodex> isn't
[09:55:22] <Nodex> another
[09:55:24] <Nodex> way
[09:55:25] <Nodex> like
[09:55:26] <Nodex> in
[09:55:27] <Nodex> the
[09:55:29] <Nodex> database
[09:55:35] <shmoon> hm
[09:55:51] <shmoon> can't believe this
[09:57:21] <shmoon> is this a planned feature? wonder if you know or not
[09:57:56] <moogway> I am using db.test.create_index([("FieldName", pymongo.DESCENDING)]) but isn't that going to create an index every time an instance of the app is called?
[09:58:16] <Derick> moogway: it's going to try - is there an "ensure_index" perhaps?
[09:58:29] <moogway> yep, there is
[09:58:50] <Derick> that's a cached version of create_index
[09:58:56] <moogway> but isn't ensure_index a sort of superset of create index?
[09:59:16] <Derick> well, it basically checks if the index exists (also in local cache) before creating one
[09:59:17] <moogway> okay, got it
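Derick's description of `ensure_index` as "a cached version of create_index" can be sketched with a stub: the wrapper remembers which index specs it has already sent, so repeated calls from app startup code don't all hit the server. The stub collection below is illustrative, not the pymongo API:

```python
class StubCollection:
    """Stand-in for a driver collection; counts server round-trips."""
    def __init__(self):
        self.create_calls = 0

    def create_index(self, spec):
        self.create_calls += 1  # pretend we sent createIndex to the server

class EnsuringCollection:
    """Wrap create_index with a local cache, like ensure_index."""
    def __init__(self, coll):
        self._coll = coll
        self._seen = set()

    def ensure_index(self, spec):
        key = tuple(spec)
        if key not in self._seen:       # only the first call goes out
            self._coll.create_index(spec)
            self._seen.add(key)

raw = StubCollection()
coll = EnsuringCollection(raw)
for _ in range(3):                      # e.g. one call per app instance/page hit
    coll.ensure_index([("FieldName", -1)])
```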
[09:59:33] <Zelest> Derick, is there a way of ensuring a TTL index through PHP?
[09:59:37] <Derick> Zelest: yes
[10:00:01] <Zelest> Derick, like, I want to store my sessions in mongodb and use a ttl collection for it.. but I don't want to reindex it every page hit :P
[10:00:30] <moogway> thanks Derick, I like mongoengine because it is declarative but prefer to use pymongo for the sake of avoiding third party libraries
[10:00:35] <Derick> Zelest: sorry, ttl is on a collection, isn't it?
[10:01:11] <Zelest> Derick, huh? yeah?
[10:01:15] <Derick> hmm, no
[10:01:17] <Derick> that's capped, sorry
[10:01:28] <Zelest> ohh
[10:01:32] <Zelest> you meant like that.. yeah
[10:01:38] <Zelest> yeah, ttl is on indexes
[10:01:56] <Derick> Use the expireAfterSeconds option to the ensureIndex method in conjunction with a TTL value in seconds to create an expiring collection
[10:02:19] <Nodex> Zelest : use dbCommand() iirc
[10:02:21] <Zelest> Derick, but can I fire that on every page load?
[10:02:23] <Derick> http://php.net/manual/en/mongocollection.ensureindex.php
[10:02:25] <Derick> you can
[10:02:27] <Derick> sure
[10:02:35] <Derick> it's even documented
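The TTL mechanics being discussed: you put `expireAfterSeconds` on a single-field index over a date field, and the server's TTL monitor removes documents once the field is older than that many seconds. A pure-Python sketch of that expiry rule (the option name is real; the helper and index-spec dict are illustrative):

```python
from datetime import datetime, timedelta, timezone

# Illustrative shape of a TTL index definition: one ascending date field
# plus the expireAfterSeconds option (here: one hour).
TTL_INDEX = {"key": [("createdAt", 1)], "expireAfterSeconds": 3600}

def is_expired(doc, now, expire_after_seconds=3600):
    """Mirror the TTL monitor's rule: a document expires once its
    indexed date field is older than expireAfterSeconds."""
    return now - doc["createdAt"] > timedelta(seconds=expire_after_seconds)

now = datetime(2013, 7, 9, 12, 0, tzinfo=timezone.utc)
fresh_session = {"createdAt": now - timedelta(minutes=30)}
stale_session = {"createdAt": now - timedelta(hours=2)}
```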
[10:02:42] <Zelest> oh
[10:02:45] <Zelest> sorry for asking then :)
[10:02:52] <Zelest> oh btw!
[10:02:57] <Zelest> very off-topic..
[10:03:16] <Zelest> but today it's 1 year without energy drinks! :D
[10:03:20] <Derick> haha
[10:03:34] <Zelest> yet I'm speeded like a freak :D
[10:04:16] <[AD]Turbo> how to remove the limit of 100 documents returned by $nearSphere, is it possible? my nearSphere queries all have a $maxDistance parameter but I'm not guaranteed that results would be less than 100 docs
[10:05:04] <Derick> [AD]Turbo: you can add $limit: 500 f.e.
[10:05:09] <Derick> let me double check that
[10:06:38] <[AD]Turbo> from official documentation only geonear command has an additional 'limit' / 'num' parameter support, anyway I don't see the reason of such limitation (usually .find returns all documents)
[10:06:54] <Derick> [AD]Turbo: ah, sorry
[10:06:59] <Derick> you can just use limit()
[10:07:22] <Derick> I'd suggest you use $near with a 2dsphere index though
[10:07:26] <Derick> http://docs.mongodb.org/manual/reference/operator/near/
[10:07:39] <Derick> ok
[10:07:44] <Derick> let me shut up and try it first
[10:10:20] <[AD]Turbo> I can't really understand, from a developer point of view, why they'd introduce such a 'limit' limitation for geo queries (and not for standard non-geo queries)
[10:11:17] <Derick> [AD]Turbo: because it's not a fast operation to have it unbound
[10:12:12] <[AD]Turbo> the same approach for standard queries could not be proposed for such geo-queries? memory pools?
[10:13:06] <[AD]Turbo> if I set limit to, for example, 100000000, does the mongodb allocates some memory of such amount or so?
[10:13:17] <Derick> uh, no
[10:13:22] <Derick> it's not a memory thing
[10:13:37] <Zelest> Derick, yay, the index part worked lovely. thanks :)
[10:13:47] <Derick> an unbound nearSphere/near needs to calculate the distance for *all* the points in the database if you don't have a maxDistance
[10:13:51] <[AD]Turbo> oh, so what if no memory related?
[10:14:02] <Derick> if you have 500, that's fine
[10:14:08] <Derick> but if you have 5 million, that's not find
[10:14:13] <[AD]Turbo> but if i have a maxDistance?
[10:14:13] <Derick> fine*
[10:14:31] <Derick> in that case, it still limits you to be consistent I suppose.
[10:15:25] <Derick> I can't see how limit can be used though... to increase it
[10:16:36] <[AD]Turbo> but, maxDistance=someValue + limit = 10000000000 is a similar situation to not having a maxDistance and having an unbound query
[10:17:04] <[AD]Turbo> if I expect to have thousands results
[10:17:05] <Derick> yeah, but limit is theoretical here
[10:18:25] <Derick> [AD]Turbo: $geoNear in aggregation supports limit
[10:18:37] <Derick> I'd suggest you use that instead until geoNear uses a real cursor
[10:18:39] <Derick> http://docs.mongodb.org/manual/reference/aggregation/geoNear/
[10:19:30] <[AD]Turbo> so geoNear would be the best option from a practical perspective?
[10:19:38] <Derick> yeah
[10:20:47] <[AD]Turbo> even without aggregration, I suppose
[10:22:37] <perplexa> sigh... what other arguments are there for aggregate over mapreduce other than reducing complexity and processing time? :|
[10:23:07] <perplexa> the dude responsible for upgrading claims 'blabla stable debian is more important'...
[10:23:20] <perplexa> such a harsh debate ;/
[10:23:55] <Nodex> is that a question?
[10:24:08] <perplexa> yeah, i need to convince him
[10:24:24] <perplexa> his only argument is that there's no stable deb packet in wheezy..
[10:25:53] <SomeoneWeird> what
[10:25:57] <SomeoneWeird> slap him
[10:26:15] <perplexa> i can't slap the cto... but his argument is a fucking joke
[10:26:16] <perplexa> ;p
[10:26:31] <SomeoneWeird> ah.. yeah that may be a bad idea.
[10:27:41] <Derick> perplexa: 10gen has its own apt repositories
[10:27:55] <Derick> running 2.0 is .. well, silly
[10:28:20] <Derick> running old versions in general is silly
[10:28:41] <kali> aha, a debien purist
[10:28:45] <kali> good luck with that
[10:30:56] <shmoon> i want to find all the documents where a particular field is existing/set ?
[10:30:58] <shmoon> possible?
[10:31:07] <Derick> perplexa: http://docs.mongodb.org/manual/tutorial/install-mongodb-on-debian-or-ubuntu-linux/
[10:31:10] <Derick> shmoon: yes
[10:31:19] <Derick> shmoon: https://www.google.co.uk/search?q=mongodb%20exists
[10:36:51] <shmoon> thanks
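The `$exists` operator Derick is pointing shmoon at matches on field *presence*, not value, so a field explicitly set to null still matches `{"$exists": true}`. A client-side sketch of that semantics:

```python
def matches_exists(doc, field, should_exist=True):
    """Client-side rendering of {field: {"$exists": should_exist}}:
    $exists tests field presence, even when the value is null/None."""
    return (field in doc) == should_exist

docs = [{"a": 1, "b": 2}, {"a": 3}, {"a": 4, "b": None}]
with_b = [d for d in docs if matches_exists(d, "b")]
```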
[10:38:42] <perplexa> Derick: yeah i've shown him that
[10:39:04] <perplexa> he's afraid of the 16mb pipeline output limit of the aggregation stuff now ;p
[10:39:15] <Derick> o_O
[10:41:01] <perplexa> what does 'output from the pipeline' exactly mean?
[10:41:14] <perplexa> is it the result?
[10:41:38] <perplexa> or the piped data between the pipeline operators?
[10:41:48] <perplexa> but that wouldn;t make sense
[10:41:52] <perplexa> i guess
[10:41:54] <kali> the 16MB applies to all intermediary states, but the pipeline is optimized
[10:42:25] <perplexa> kali gets the award for confusing me even more ;p
[10:42:34] <perplexa> i'm a newb :)
[10:42:44] <perplexa> may you elaborate, please?
[10:42:48] <kali> sorry, but real life is complicated :)
[10:43:00] <Derick> kali: are you sure it's for all states?
[10:43:12] <Derick> I've however never run into this limitation
[10:43:21] <kali> Derick: i have, plenty of times
[10:43:24] <Derick> 16mb is quite a lot!
[10:43:28] <Derick> kali: ok :-)
[10:44:05] <kali> to simplify: if you do a $sort at some point, the input needs to be stored in RAM completely before the $sort can actually sort
[10:44:21] <perplexa> 16mb seems not so much, considering my sql queries return 100mio rows sometimes ;/
[10:44:22] <kali> the input = the input of the $sort
[10:44:41] <Derick> kali: right, but not necessarily for match or project I thought
[10:44:48] <kali> Derick: that's right.
[10:44:57] <Derick> it's only sort and group that require it IIRC
[10:44:59] <sag> i guess this has changed in 2.4.5 as per release notes
[10:45:25] <kali> but then, the optimisations kicks in: of you do $sort followed by $limit, you only need to keep in RAM the size of the limit
[10:45:34] <Derick> yup
[10:45:35] <kali> Derick: yep
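kali's point about the $sort-followed-by-$limit optimisation: once the server sees the $limit immediately after the $sort, it only needs to keep `limit` candidates resident, a top-k selection rather than a full in-memory sort. A sketch of the same idea using a bounded heap:

```python
import heapq

def sort_then_limit(docs, key, limit):
    """Top-k equivalent of [{"$sort": {key: 1}}, {"$limit": limit}]:
    heapq keeps only `limit` items resident instead of the full set."""
    return heapq.nsmallest(limit, docs, key=lambda d: d[key])

docs = [{"score": s} for s in (5, 1, 4, 2, 3)]
top2 = sort_then_limit(docs, "score", 2)
```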
[10:46:29] <perplexa> > db.version()
[10:46:30] <perplexa> 2.4.5
[10:46:33] <perplexa> finally. :P
[10:46:34] <Derick> yay
[10:47:15] <sag> kali: as per release note "the MongoDB can perform a more efficient sort that does not require keeping the entire result set in memory"
[10:47:35] <sag> kali:ohh i missed your replies
[10:48:03] <sag> its seems i am on same page :)
[11:11:20] <huahax> hi
[11:11:21] <huahax> http://stackoverflow.com/questions/17546710/how-to-push-in-nested-array-in-mongoose
[11:51:58] <perplexa> rofl, turn on music, chill with headphones on, suddenly everybody starts looking into my direction, realise speakers are at max volume and headphones unplugged :D
[11:52:12] <Zelest> haha
[11:54:09] <borior> hi all, so I'm having trouble configuring a replicaset when the nodes are portmapped (NATed) behind another IP address
[11:54:45] <borior> specifically, in my argument to rs.initiate, I'm specifying the hosts with their "external" IP addresses: 192.168.X.X
[11:54:53] <huahax> ey, someone want to help me with some basic queries in nested arrays? :)
[11:55:42] <borior> then the node that's running initiate runs getMyAddrs(), which returns a list that doesn't include that address, and the replSet initialization bombs out with: "couldn't initiate : can't find self in the replset config"
[11:56:17] <borior> is there any way around this? why on earth does mongo need to know the IP on which it is exposed the rest of the world?
[12:15:53] <perplexa> kali: so, when i have db.auctions.aggregate( { $match : { bId: 893634 } } ) which yields a result >16MB, will it produce an error?
[12:16:09] <kali> yes
[12:16:17] <perplexa> that's... shit
[12:17:06] <kali> perplexa: well, all tools have an application domain
[12:37:12] <borior> so if my interpretation of this is correct, it's basically impossible to run a replicaset in which members refer to one another using NATed IP addresses...
[12:37:15] <borior> yay
[13:17:29] <dahankzter> Is it possible to have a more complex GeoJSON structure, say each point has time data associated
[13:17:57] <dahankzter> or is it locked to the Type/coordarray structure?
[13:59:45] <bdiu> Anyone interested in a full time gig in Bloomington, In w/ paid relocation? :-)
[14:10:32] <dahankzter> Is it possible to have a more complex GeoJSON structure, say each point has time data associated
[14:11:01] <dahankzter> No one knows? The online GeoJSON parsers out there give ambiguous info
[14:37:03] <richthegeek> I have an array of values [1,2], and I want to find all rows which contain any of those values in their own array... so a row might be {_id: 42, vals: [2, 3]}
[14:37:17] <richthegeek> is {vals: {$in: [1,2]}} the right query?
[14:40:34] <rspijker> richthegeek: from the documentation: If the field holds an array, then the $in operator selects the documents whose field holds an array that contains at least one element that matches a value in the specified array (e.g. <value1>, <value2>, etc.)
[14:41:07] <richthegeek> rspijker: yeah, it's working now - I was using $not instead of $ne !
[14:41:11] <richthegeek> thanks though
[14:41:25] <rspijker> sure :)
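The behaviour rspijker quotes from the documentation, sketched client-side: on an array field, `$in` matches when the document's array and the query array share at least one element. With richthegeek's example document:

```python
def matches_in(doc_vals, query_vals):
    """{vals: {"$in": query_vals}} against an array field matches
    when the two arrays overlap in at least one element."""
    return any(v in query_vals for v in doc_vals)

row = {"_id": 42, "vals": [2, 3]}
hit = matches_in(row["vals"], [1, 2])   # 2 is in both arrays
miss = matches_in(row["vals"], [7, 8])  # no overlap
```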
[15:33:07] <rodrigofelix> hi, where is persisted the replica set configuration?
[15:35:00] <harenson> rodrigofelix: in the "local" db
[15:35:04] <rspijker> local.system.replset
[15:35:07] <rspijker> if memory servers
[15:35:27] <rspijker> serves...
[15:36:12] <rodrigofelix> to reset the rs config the only thing I have to do is delete local db?
[15:37:04] <harenson> rodrigofelix: look at here http://docs.mongodb.org/manual/reference/replica-configuration/
[15:38:15] <rodrigofelix> ok, I did
[15:38:53] <rodrigofelix> what is the default replication factor of mongodb/
[15:40:37] <EmmEight> Hello
[15:40:39] <redsand> rodrigofelix: in a replica set i believe it is all
[15:40:54] <redsand> and then shards replicate within themselves but not across shards
[15:41:50] <rspijker> what do you mean when you say replication factor?
[15:43:08] <rspijker> because I have only ever seen that term used in conjunction with WriteConcern
[15:43:46] <redsand> rspijker: good point
[15:44:21] <rodrigofelix> I mean how many replicas have the same data
[15:44:40] <rspijker> all of them do
[15:45:17] <zymogens> Am a complete noob to mongodb. Have a quick question. I have a few object arrays inside a mongodb doc. It seems that mongodb assigns an object_id to all of them. I realise every doc needs an object_id, but does every object inside a doc also need one? Thanks.
[15:45:20] <rodrigofelix> ok, so all replicas have the records, right?
[15:45:30] <rodrigofelix> * the same
[15:45:54] <rspijker> rodrigofelix: at any given time, not necessarily, due to replication lag. But on an idle system, yes
[15:46:06] <rodrigofelix> yeap, understood
[15:46:46] <rodrigofelix> is this configurable? could I say that I want to have, for instance, only two replicas storing a specific record, even if my cluster has 10 nodes?
[15:47:45] <rodrigofelix> I'm trying to have a similar env to compare mongodb with cassandra
[15:48:06] <rspijker> rodrigofelix: afaik you can't. But I am not sure
[15:48:43] <rodrigofelix> ok. I'll try to figure out if I can change cassandra to work like mongo, replicating all data in all nodes
[15:50:58] <rspijker> zymogens: it does not.
[15:51:30] <zymogens> rspijker: oh ok… thanks … is it easy to turn off
[15:51:42] <nickmbailey> rodrigofelix: if you try to make cassandra work like mongo it will not perform very well
[15:52:05] <rodrigofelix> :/ .. why nickmbailey?
[15:52:32] <rspijker> zymogens: you shouldn't have to do anything… If you insert a document into mongo and you don't include an _id, mongo will create it for you. For embedded documents this is simply not the case...
[15:52:45] <nickmbailey> rodrigofelix: because it isn't mongo
[15:54:31] <rodrigofelix> :) .. ok, but what is the aspect of cassandra architecture (or strategy) that you are considering to say that cassandra does not perform well when all nodes have that same data?
[15:54:34] <zymogens> rspijker: thanks. there seems to have been an object_id for every object I have in each document. Not sure how they got there… Will look into it some more.
[15:55:17] <rodrigofelix> there are some points that I need to align when benchmarking cassandra and mongodb and I think that replication factor is one of them
[15:55:23] <nickmbailey> well it would probably do fine if you only have 3 nodes (3 is a very common replication factor) but if you are trying to apply that generally (with N nodes) then you are missing the point of cassandra
[15:55:42] <rodrigofelix> ok, understood
[15:56:23] <rodrigofelix> I think about varying from 3 to 5 nodes in my experiments, both on cassandra and mongodb
[15:56:36] <rspijker> zymogens: you can easily test it. db.Collection.insert({}) will create a document with an _id. db.Collection.insert({"c":{}}) will create a document with an _id and an empty subdocument "c" inside.
[15:56:44] <rodrigofelix> I'll try to run both with their default config and gather some results
[15:57:10] <rspijker> why do you want such a large number of replicated nodes if you don't mind me asking?
[15:59:15] <zymogens> rspijker: Is that a question for me?
[15:59:30] <rspijker> no, for rodrigofelix
[15:59:47] <rodrigofelix> well, I could have less replicated nodes. my main concern is trying to have a fair comparison among cassandra and mongodb, trying to have similar config although I know this it not that easy, since they have many different strategies
[16:00:33] <rodrigofelix> my idea was changing cassandra, because I can't see (for now) how to change replication factor of mongodb
[16:01:37] <rspijker> if you want to compare them, then you shouldn't look at having the exact same amount of nodes and replication in both cases (imo). You should look at a similar setup in terms of failover...
[16:01:42] <rodrigofelix> but I believe the best I can do is to benchmark with default configs of both and then compare how elastic they are according to some metrics I'm defining
[16:01:59] <zymogens> rspijker: Just tried it out there… Seems to not need an object_id… Thanks
[16:02:01] <rspijker> as in, if you have 9 cassandra nodes with a replication factor of 3 then you could have 3 mongo shards with 3 replica set members
[16:02:50] <rspijker> anyway, I got to go. Good luck :)
[16:03:10] <rodrigofelix> I understood your point. I'm gonna think about it
[16:03:16] <rodrigofelix> it makes sense
[16:03:24] <rodrigofelix> thanks :)
[16:04:15] <michael_____> hey, in the current mongod version, is safe=true per default activated (for unique indexes)?
[16:09:39] <dahankzter> Is it possible to do something like this {"legs":1,"legs":{"$slice":1} to to get only the first element of the array "legs"?
[16:11:18] <dahankzter> The other way seems to work {"legs":{"$slice":1},"legs":1} :)
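What dahankzter's projection does: `{"legs": {"$slice": 1}}` keeps only the first element of the `legs` array (a negative count takes from the end). An illustrative client-side helper for that projection rule:

```python
def project_slice(doc, field, count):
    """Apply a {field: {"$slice": count}} projection to one array field;
    positive counts take from the front, negative from the end."""
    arr = doc[field]
    sliced = arr[:count] if count >= 0 else arr[count:]
    return {**doc, field: sliced}

doc = {"_id": 1, "legs": ["a", "b", "c"]}
first_leg = project_slice(doc, "legs", 1)
```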
[16:29:16] <ScottBPX> Is this support for mongoDB?
[16:30:39] <ScottBPX> Hello
[16:41:10] <huahax> hi
[16:41:29] <huahax> anyone have experience with nested arrays?
[16:55:39] <zymogens> Hi, Am using Mongoose. Object_IDs are being automatically generated for nested objects in a doc before I do a save() … Any idea how I'd prevent them being generated?
[16:58:12] <huahax> anyone have experience with nested arrays?
[17:06:18] <huahax> is it possible to push to an array in another array..?
[17:08:09] <mmlac-bv> Can a compound index index a -> b -> c -> d replace having another index a -> b -> d?
[17:09:52] <mmlac-bv> And does the order matter to a query? Or is a index a -> b -> c as good as c -> b -> a if the query is i.e. find id: a where b=1 and order by c?
[17:13:56] <huahax> no one here :<
[17:28:49] <huahax> 'posts.$.comments.0.text': "a comment"
[17:28:57] <huahax> anyone have a clue to get this dynamic? :)
[17:29:18] <huahax> i use $set
[17:30:04] <kali> huahax: yeah, we're shying away from double-nested arrays :)
[17:30:13] <huahax> haha
[17:30:17] <huahax> i've googled my ass off
[17:30:28] <huahax> it seems that the $ operator doesnt reach that far :'(
[17:30:46] <kali> mmlac-bv: it DOES matter.
[17:31:07] <huahax> u skilled with mongo kali?
[17:31:47] <kali> huahax: i guess i am
[17:31:51] <mmlac-bv> kali performance-wise or "mongo does no longer use this index" wise?
[17:32:11] <huahax> nice
[17:32:17] <kali> mmlac-bv: both
[17:32:38] <mmlac-bv> hm, ok, I can see the range at the front being bad
[17:32:49] <mmlac-bv> i.e. ail not use the index
[17:32:52] <mmlac-bv> will*
[17:32:54] <huahax> would u say it is practically impossible at the moment to make a nested array dynamic?
[17:32:59] <kali> mmlac-bv: if you want the index to be efficient, you need mongodb to be able to anwser the query without making too many random accesses
[17:33:17] <mmlac-bv> well the issue right now is the indexes are so massive they won't even fit into memory
[17:33:21] <mmlac-bv> so I gotta weed out
[17:33:27] <kali> mmlac-bv: waw, this is bad :)
[17:33:30] <mmlac-bv> yep
[17:34:28] <mmlac-bv> thanks for your help
[17:34:49] <kali> huahax: yeah... it's gonna be tricky
[17:35:00] <huahax> okey :<
[17:35:26] <huahax> i guess i'll redesign my schema then
[17:35:44] <kali> mmlac-bv: you don't need a perfect match. the order has to be at the right of what you're matching.
[17:36:12] <kali> mmlac-bv: a few number of fields (if they're not highly selective) can creep to the left of the order
[17:36:16] <mmlac-bv> Yeah I'm trying to wrap my head around it right now… :D
[17:36:37] <kali> mmlac-bv: sorry, no, they need to be right of the order :)
[17:36:51] <mmlac-bv> So basically skipping a field does not work, correct? I.e. a -> b -> c does not work for a -> c ?
[17:37:05] <mmlac-bv> sort by c in this case
[17:37:14] <kali> no, that won't work
[17:37:22] <mmlac-bv> *sigh* thanks
[17:37:25] <kali> a,c,b, may work depending on cardinalities
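kali's rule of thumb here is the leftmost-prefix rule: a compound index on (a, b, c, d) can serve a query whose fields form a gap-free prefix of the index, which is why (a, c) needs its own index and why field order matters. A simplified checker for equality-only queries (it ignores the sort/cardinality subtleties kali mentions):

```python
def index_serves(index_fields, query_fields):
    """Simplified leftmost-prefix rule: a compound index serves a
    query whose equality fields form a gap-free prefix of the index
    (in any order among themselves)."""
    prefix = index_fields[:len(query_fields)]
    return sorted(prefix) == sorted(query_fields)

idx = ["a", "b", "c", "d"]
```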
[17:43:58] <huahax> hey Kali, would u mind looking at my stackoverflow question? :)
[17:44:14] <huahax> if it's totally hopeless, i'll redesign my schema...
[17:44:48] <kali> huahax: give me a link
[17:44:53] <huahax> http://stackoverflow.com/questions/17554788/mongodb-updating-dynamic-nested-arrays
[17:45:30] <kali> huahax: nothing to add
[17:45:55] <huahax> u think there's a solution?
[17:46:01] <huahax> if not i dont want to waste more time on it
[17:46:13] <kali> huahax: redesign. make the post the top level object
[17:47:11] <huahax> kali: ok. isnt it already top level?
[17:47:43] <kali> huahax: i don't read mongoose
[17:47:49] <kali> huahax: so i'm just guessing
[17:47:57] <huahax> btw, i have more entries in the post array, like "time added" etc, so i need it as an array
[17:48:03] <huahax> okey
[17:48:24] <huahax> kali: i'll redesign, thx
[17:48:43] <kali> huahax: i think you'd better dump down the question and write them in mongo shell to get more people to help you
[17:48:55] <kali> dumb down
[17:49:06] <huahax> dumb down = delete? =p
[17:49:15] <huahax> u mean tag mongoshell?
[17:49:18] <kali> huahax: naaa :) just get mongoose out of the way
[17:49:22] <huahax> ah
[17:49:27] <kali> write them in plain js, in the mongoshell
[17:49:32] <huahax> it's very similar
[17:49:36] <huahax> if not exactly the same
[17:49:50] <kali> "Schema" is not mongodb :P
[17:51:03] <huahax> oh..
[17:51:14] <kali> everybody will understand an example document in js
[17:51:20] <huahax> but dont know how to write it so it's just mongoshell
[17:51:49] <Dark_Sun> hi there
[17:52:03] <Dark_Sun> "errmsg" : "still syncing, not yet to minValid optime 51dc40a8:49b"
[17:52:16] <Dark_Sun> what's the exact meaning of this ?
[18:12:31] <TheDeveloper___> why does update() 'ing $inc counters on a single node work, but as soon as I switch on sharding it drops most of the writes
[19:34:16] <AndrewD> Hey, I'm trying to use pymongo to create unique IDs for some dictionaries, but when I call " bson.objectid.ObjectId " it always returns the same value. Any ideas?
[19:48:44] <AndrewD> Ah, nevermind, solved it.
[20:01:11] <oogabubchub> Anyone here?
[20:01:44] <Zelest> mhm
[20:04:28] <oogabubchub> Anyone know which performs better: compound indices or embedded doc indices? It would be on ID's stored in an array vs. an embedded doc for each ID. Typical circumstance is 1 or 2 ID's over millions of docs
[20:04:44] <oogabubchub> On each doc, that is
[20:05:43] <solars> hi there, the compact docs say: but unlike repairDatabase it does not free space on the file system. - but other sites say that it compacts the db and frees disk space - whats true now?