[02:57:44] <pasky_> Hi! Is there an easy way to make a newly inserted object have an _id that is a plain string instead of ObjectId? (Unfortunately, having it ObjectId seems to produce endless pain in conjunction with Meteor.)
[02:57:56] <pasky_> (I'm using pymongo to insert objects.)
[03:05:40] <pasky_> ok, adding '_id': str(bson.objectid.ObjectId()) to the inserted object seems to do the trick \o/
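For reference, a minimal sketch of pasky_'s trick, assuming pymongo with a local mongod and a made-up 'items' collection:

    from pymongo import MongoClient
    from bson.objectid import ObjectId

    client = MongoClient()  # assumes a local mongod on the default port
    db = client.test
    # generate the ObjectId client-side and store its string form as _id
    db.items.insert({'_id': str(ObjectId()), 'title': 'example'})
    # the stored _id is now a plain string rather than an ObjectId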
[07:53:12] <TraceRoute> hey all, I'm looking at using mongodb to add redundancy to my infrastructure. Could someone confirm this is possible? I have servers at different locations and a parent mongodb which acts as the master. The servers at different locations should only have data which relates to their location. Is there some way I can use replication to only pull down records that relate to the location and also push local changes back to the master?
[07:56:36] <sag> i think you have to manage that in the schema
[07:57:17] <sag> i mean you have to define your collections that way..
[07:59:29] <TraceRoute> ideally that's how the root schema would start…a location has a bunch of objects inside. I only want the one server to have replication of 1 root object
[07:59:32] <sag> i dont think this can be possible with replica sets
[08:02:17] <shmoon> i do db.coll.find()... and also db.coll.find({..}) , assign it to a variable, i need to access _id now in the mongo console but i am failing
[08:02:36] <shmoon> basically i want to get the date and time for each document from find() off their _id (since i think _id has a timestamp)
[08:02:43] <TraceRoute> thx sag, I'll look into it more
[08:03:33] <sag> @TraceRoute : why do you need a primary mongo.. as you already have some store..
[08:03:50] <sag> @TraceRoute and you just want redundancy
[08:13:24] <sag> shmoon: you need all ids from all collections? or all ids from some collection?
[08:14:45] <shmoon> sag: let's do 1 by 1. let's say i do this db.coll.find({table_id:5}).limit() now how do I get the _id from it and maybe store it in a variable in the console?
[08:18:33] <sag> shmoon: find always returns a document
[08:21:20] <sag> shmoon: so you will always get the whole document unless specified, e.g. db.coll.find({table_id:5},{title:1}) will only return _id and title..
[08:59:49] <shmoon> or maybe i am missing something
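What shmoon is after works because the first four bytes of an ObjectId encode its creation time; the mongo shell exposes this as ObjectId.getTimestamp(), and pymongo as the generation_time property. A sketch using the collection and query from the question above:

    from pymongo import MongoClient

    db = MongoClient().test  # assumes a local mongod
    for doc in db.coll.find({'table_id': 5}):
        # generation_time decodes the timestamp embedded in the ObjectId
        print(doc['_id'], doc['_id'].generation_time)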
[09:14:13] <untaken> Anyone familiar with the CPAN modules for MongoDB? What is the most efficient way to set up paging? When I check the cursor afterwards with has_next, it seems to have pulled all the rows from the collection. There must be a more efficient way when I set limit and skip? Maybe it's just the perl module, but was hoping someone may know around here?
[09:15:36] <Derick> untaken: that doesn't seem related to just the perl driver
[09:17:12] <untaken> Derick: I know, but wasn't sure where to look next... really was after some pointers :)
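A hedged pointer for untaken: the usual paging pattern is sort + skip + limit, which the Perl driver's cursor also exposes; shown here as a pymongo sketch with made-up names. Note the server still walks the skipped documents, so range queries on an indexed field page more efficiently on large collections:

    from pymongo import MongoClient

    db = MongoClient().test  # assumes a local mongod
    page, per_page = 3, 10   # fetch the third page of ten documents
    cursor = (db.items.find()
                .sort('_id', 1)                 # stable order for paging
                .skip((page - 1) * per_page)
                .limit(per_page))               # caps what the server returns
    docs = list(cursor)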
[09:17:23] <untaken> bit of channel attack there :/
[09:17:36] <Derick> seems odn is a bit flakey at the moment
[09:20:00] <[AD]Turbo> Sorry, I must post my question again (split sucks)
[09:20:03] <[AD]Turbo> I have a collection (Items) with a 2dsphere index on a 'pos' field (db.Items.getIndexes() returns me "key" : { "pos" : "2dsphere" }, "name" : "pos_2dsphere") but when I query that collection db.Items.find({ 'pos': { $geoWithin: { $box: [ [0, 0], [50, 50] ] } } }).explain() I see that a "BasicCursor" is chosen instead of the index. Is there a reason?
[09:20:42] <Derick> [AD]Turbo: geowithin with box wants a 2d index
[09:21:07] <Derick> "$box" is a flatland feature, not a spherical one
[09:21:16] <[AD]Turbo> ah, but I need to query that collection with $nearSphere too
[09:21:45] <Derick> [AD]Turbo: try geowithin + geojson where you construct the polygon yourself
[09:22:34] <[AD]Turbo> does the "2d" index support $nearSphere and geoWithin+box at the same time?
[09:22:55] <[AD]Turbo> if so, i can use a 2d (not a 2dsphere)
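A sketch of what that would look like, on the understanding (hedged) that a 2d index over legacy [x, y] pairs serves both the flat $box query and $nearSphere, with $maxDistance given in radians:

    from pymongo import MongoClient

    db = MongoClient().test  # assumes a local mongod
    db.Items.create_index([('pos', '2d')])
    # $box is a flat-plane operator, so it wants the 2d index
    db.Items.find({'pos': {'$geoWithin': {'$box': [[0, 0], [50, 50]]}}})
    # $nearSphere also runs against a 2d index on legacy pairs
    db.Items.find({'pos': {'$nearSphere': [10, 10], '$maxDistance': 0.1}})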
[09:23:19] <perplexa> http://pastebin.com/2rrzaXrR < can anybody please explain why i get this error? :(
[09:54:13] <moogway> hi, i am trying to use mongodb with python and wondering about ORMs... I prefer pymongo over anything else but it doesn't support declaring models, and I'm not sure how to tell mongo to create/ensure an index on a collection from my app. Do I have to create the db and collections manually using the mongo shell?
[09:57:21] <shmoon> is this a planned feature? wonder if you know or not
[09:57:56] <moogway> I am using db.test.create_index([("FieldName", pymongo.DESCENDING)]) but isn't that going to create an index every time an instance of the app is called?
[09:58:16] <Derick> moogway: it's going to try - is there an "ensure_index" perhaps?
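pymongo 2.x does have ensure_index; it remembers recently created indexes client-side and skips the server round-trip, so calling it on every app start is cheap. A sketch with moogway's field name:

    import pymongo
    from pymongo import MongoClient

    db = MongoClient().test  # assumes a local mongod
    # no-op if the driver already created this index recently
    db.test.ensure_index([('FieldName', pymongo.DESCENDING)])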
[10:00:01] <Zelest> Derick, like, I want to store my sessions in mongodb and use a ttl collection for it.. but I don't want to reindex it every page hit :P
[10:00:30] <moogway> thanks Derick, I like mongoengine because it is declarative but prefer to use pymongo for the sake of avoiding third party libraries
[10:00:35] <Derick> Zelest: sorry, ttl is on a collection, isn't it?
[10:01:56] <Derick> Use the expireAfterSeconds option to the ensureIndex method in conjunction with a TTL value in seconds to create an expiring collection
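A sketch of that TTL setup in pymongo, with made-up collection and field names; documents expire roughly expireAfterSeconds after the value in the indexed date field:

    import datetime
    from pymongo import MongoClient

    db = MongoClient().test  # assumes a local mongod
    # create once; sessions vanish ~1 hour after their createdAt
    db.sessions.ensure_index('createdAt', expireAfterSeconds=3600)
    db.sessions.insert({'sid': 'abc123',
                        'createdAt': datetime.datetime.utcnow()})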
[10:03:34] <Zelest> yet I'm speeded like a freak :D
[10:04:16] <[AD]Turbo> how to remove the limit of 100 documents returned by $nearSphere, is it possible? my nearSphere queries all have a $maxDistance parameter but I'm not assured that results would be less than 100 docs
[10:05:04] <Derick> [AD]Turbo: you can add $limit: 500 f.e.
[10:06:38] <[AD]Turbo> from the official documentation only the geoNear command has an additional 'limit' / 'num' parameter; anyway I don't see the reason for such a limitation (usually .find returns all documents)
[10:07:44] <Derick> let me shut up and try it first
[10:10:20] <[AD]Turbo> I can't really understand, from a developer point of view, why such a 'limit' limitation was introduced for geo queries (and not for standard non-geo queries)
[10:11:17] <Derick> [AD]Turbo: because it's not a fast operation to have it unbound
[10:12:12] <[AD]Turbo> couldn't the same approach used for standard queries be applied to such geo-queries? memory pools?
[10:13:06] <[AD]Turbo> if I set limit to, for example, 100000000, does mongodb allocate some memory of that amount or so?
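For the 100-document cap itself, the geoNear command accepts an explicit num; a pymongo sketch with made-up coordinates (command documents are order-sensitive, hence SON):

    from bson.son import SON
    from pymongo import MongoClient

    db = MongoClient().test  # assumes a local mongod
    result = db.command(SON([
        ('geoNear', 'Items'),
        ('near', [10, 10]),
        ('spherical', True),
        ('maxDistance', 0.1),   # radians for legacy coordinate pairs
        ('num', 500),           # raise the default cap of 100 results
    ]))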
[10:47:15] <sag> kali: as per the release note, "the MongoDB can perform a more efficient sort that does not require keeping the entire result set in memory"
[11:51:58] <perplexa> rofl, turn on music, chill with headphones on, suddenly everybody starts looking into my direction, realise speakers are at max volume and headphones unplugged :D
[11:54:09] <borior> hi all, so I'm having trouble configuring a replicaset when the nodes are portmapped (NATed) behind another IP address
[11:54:45] <borior> specifically, in my argument to rs.initiate, I'm specifying the hosts with their "external" IP addresses: 192.168.X.X
[11:54:53] <huahax> ey, someone want to help me with some basic queries in nested arrays? :)
[11:55:42] <borior> then the node that's running initiate runs getMyAddrs(), which returns a list that doesn't include that address, and the replSet initialization bombs out with: "couldn't initiate : can't find self in the replset config"
[11:56:17] <borior> is there any way around this? why on earth does mongo need to know the IP on which it is exposed to the rest of the world?
[12:15:53] <perplexa> kali: so, when i have db.auctions.aggregate( { $match : { bId: 893634 } } ) which yields a result >16MB, will it produce an error?
[12:17:06] <kali> perplexa: well, all tools have an application domain
[12:37:12] <borior> so if my interpretation of this is correct, it's basically impossible to run a replicaset in which members refer to one another using NATed IP addresses...
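One workaround, sketched here with hypothetical hostnames: put DNS names rather than raw IPs in the replica set config, so each side of the NAT resolves them to whichever address works locally:

    from pymongo import MongoClient

    client = MongoClient('node0.example.internal')  # the would-be primary
    config = {'_id': 'rs0', 'members': [
        {'_id': 0, 'host': 'node0.example.internal:27017'},
        {'_id': 1, 'host': 'node1.example.internal:27017'},
    ]}
    client.admin.command('replSetInitiate', config)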
[13:17:29] <dahankzter> Is it possible to have a more complex GeoJSON structure, say each point has time data associated
[13:17:57] <dahankzter> or is it locked to the type/coordinates structure?
[13:59:45] <bdiu> Anyone interested in a full time gig in Bloomington, IN w/ paid relocation? :-)
[14:10:32] <dahankzter> Is it possible to have a more complex GeoJSON structure, say each point has time data associated
[14:11:01] <dahankzter> No one knows? The online GeoJSON parsers out there give ambiguous info
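On dahankzter's question: the GeoJSON value itself is locked to the type/coordinates shape, but extra fields can live beside it in the same document. A sketch with made-up names:

    import datetime
    from pymongo import MongoClient

    db = MongoClient().test  # assumes a local mongod
    db.tracks.insert({
        'pos': {'type': 'Point', 'coordinates': [13.4, 52.52]},  # GeoJSON proper
        'ts': datetime.datetime.utcnow(),  # time data next to it, not inside it
    })
    db.tracks.create_index([('pos', '2dsphere')])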
[14:37:03] <richthegeek> I have an array of values [1,2], and I want to find all rows which contain any of those values in their own array... so a row might be {_id: 42, vals: [2, 3]}
[14:37:17] <richthegeek> is {vals: {$in: [1,2]}} the right query?
[14:40:34] <rspijker> richthegeek: from the documentation: If the field holds an array, then the $in operator selects the documents whose field holds an array that contains at least one element that matches a value in the specified array (e.g. <value1>, <value2>, etc.)
[14:41:07] <richthegeek> rspijker: yeah, it's working now - I was using $not instead of $ne !
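richthegeek's query was indeed right; a self-contained sketch of $in matching against array fields, with a made-up collection:

    from pymongo import MongoClient

    db = MongoClient().test  # assumes a local mongod
    db.rows.insert({'_id': 42, 'vals': [2, 3]})
    # matches: 'vals' shares the element 2 with the query array
    print(list(db.rows.find({'vals': {'$in': [1, 2]}})))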
[15:45:17] <zymogens> Am a complete noob to mongodb. Have a quick question. I have a few object arrays inside a mongodb doc. It seems that mongodb assigns an object_id to all of them. I realise every doc needs an object_id, but does every object inside a doc also need one? Thanks.
[15:45:20] <rodrigofelix> ok, so all replicas have the records, right?
[15:46:46] <rodrigofelix> is this configurable? could I say that I want to have, for instance, only two replicas storing a specific record, even if my cluster has 10 nodes?
[15:47:45] <rodrigofelix> I'm trying to have a similar env to compare mongodb with cassandra
[15:48:06] <rspijker> rodrigofelix: afaik you can't. But I am not sure
[15:48:43] <rodrigofelix> ok. I'll try to figure out if I can change cassandra to work like mongo, replicating all data in all nodes
[15:52:32] <rspijker> zymogens: you shouldn't have to do anything… If you insert a document into mongo and you don't include an _id, mongo will create it for you. For embedded documents this is simply not the case...
[15:52:45] <nickmbailey> rodrigofelix: because it isn't mongo
[15:54:31] <rodrigofelix> :) .. ok, but what aspect of the cassandra architecture (or strategy) are you considering when you say that cassandra does not perform well when all nodes have the same data?
[15:54:34] <zymogens> rspijker: thanks. there seems to have been an object_id for every object I have in each document. Not sure how they got there… Will look into it some more.
[15:55:17] <rodrigofelix> there are some points that I need to align when benchmarking cassandra and mongodb and I think that replication factor is one of them
[15:55:23] <nickmbailey> well it would probably do fine if you only have 3 nodes (3 is a very common replication factor) but if you are trying to apply that generally (with N nodes) then you are missing the point of cassandra
[15:56:23] <rodrigofelix> I think about varying from 3 to 5 nodes in my experiments, both on cassandra and mongodb
[15:56:36] <rspijker> zymogens: you can easily test it. db.Collection.insert({}) will create a document with an _id. db.Collection.insert({"c":{}}) will create a document with an _id and an empty subdocument "c" inside.
[15:56:44] <rodrigofelix> I'll try to run both with their default config and gather some results
[15:57:10] <rspijker> why do you want such a large number of replicated nodes if you don't mind me asking?
[15:59:15] <zymogens> rspijker: Is that a question for me?
[15:59:47] <rodrigofelix> well, I could have fewer replicated nodes. my main concern is trying to have a fair comparison between cassandra and mongodb, trying to have similar configs although I know this is not that easy, since they have many different strategies
[16:00:33] <rodrigofelix> my idea was changing cassandra, because I can't see (for now) how to change replication factor of mongodb
[16:01:37] <rspijker> if you want to compare them, then you shouldn't look at having the exact same amount of nodes and replication in both cases (imo). You should look at a similar setup in terms of failover...
[16:01:42] <rodrigofelix> but I believe the best I can do is to benchmark with default configs of both and then compare how elastic they are according to some metrics I'm defining
[16:01:59] <zymogens> rspijker: Just tried it out there… Seems to not need an object_id… Thanks
[16:02:01] <rspijker> as in, if you have 9 cassandra nodes with a replication factor of 3 then you could have 3 mongo shards with 3 replica set members
[16:02:50] <rspijker> anyway, I got to go. Good luck :)
[16:03:10] <rodrigofelix> I understood your point. I'm gonna think about it
[16:41:29] <huahax> anyone have experience with nested arrays?
[16:55:39] <zymogens> Hi, Am using Mongoose. Object_IDs are being automatically generated for nested objects in a doc before I do a save() … Any idea how I'd prevent them being generated?
[16:58:12] <huahax> anyone have experience with nested arrays?
[17:06:18] <huahax> is it possible to push to an array in another array..?
[17:08:09] <mmlac-bv> Can a compound index a -> b -> c -> d replace having another index a -> b -> d?
[17:09:52] <mmlac-bv> And does the order matter to a query? Or is an index a -> b -> c as good as c -> b -> a if the query is e.g. find id: a where b=1 and order by c?
[17:32:54] <huahax> would you say it is practically impossible at the moment to make a nested array dynamic?
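On huahax's nested-array question: one level of nesting is workable via the positional $ operator, but only a single $ is allowed per update, which is why deeper dynamic nesting is so painful. A sketch with made-up field names:

    from pymongo import MongoClient

    db = MongoClient().test  # assumes a local mongod
    db.coll.insert({'outer': [{'name': 'a', 'inner': []}]})
    # $ resolves to the index of the 'outer' element the query matched
    db.coll.update({'outer.name': 'a'}, {'$push': {'outer.$.inner': 42}})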
[17:32:59] <kali> mmlac-bv: if you want the index to be efficient, you need mongodb to be able to answer the query without making too many random accesses
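To mmlac-bv's two questions: an index serves only queries that use a prefix of its keys, so a -> b -> c -> d cannot replace a -> b -> d, and key order matters. A sketch:

    from pymongo import MongoClient

    db = MongoClient().test  # assumes a local mongod
    db.coll.create_index([('a', 1), ('b', 1), ('c', 1), ('d', 1)])
    # served: {a, b} is a prefix and c is the next key, so the sort is free
    db.coll.find({'a': 5, 'b': 1}).sort('c', 1)
    # not served for the sort: skipping c breaks the prefix, so a separate
    # (a, b, d) index would still be needed
    db.coll.find({'a': 5, 'b': 1}).sort('d', 1)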
[17:33:17] <mmlac-bv> well the issue right now is the indexes are so massive they won't even fit into memory
[17:52:03] <Dark_Sun> "errmsg" : "still syncing, not yet to minValid optime 51dc40a8:49b"
[17:52:16] <Dark_Sun> what's the exact meaning of this ?
[18:12:31] <TheDeveloper___> why does update()'ing $inc counters on a single node work, but as soon as I switch on sharding it drops most of the writes?
[19:34:16] <AndrewD> Hey, I'm trying to use pymongo to create unique IDs for some dictionaries, but when I call " bson.objectid.ObjectId " it always returns the same value. Any ideas?
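AndrewD's symptom usually means the ObjectId class (or one cached instance) is being referenced rather than called; each ObjectId() call yields a fresh id. A sketch:

    from bson.objectid import ObjectId

    print(ObjectId)    # the class object itself: always 'the same value'
    ids = [str(ObjectId()) for _ in range(3)]
    print(ids)         # three distinct ids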
[20:04:28] <oogabubchub> Anyone know which performs better: compound indexes or embedded doc indexes? It would be on IDs stored in an array vs. an embedded doc for each ID. Typical circumstance is 1 or 2 IDs over millions of docs
[20:05:43] <solars> hi there, the compact docs say "but unlike repairDatabase it does not free space on the file system" - but other sites say that it compacts the db and frees disk space - what's true now?
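Both of solars's sources can be right at once: compact defragments a collection inside the existing data files (freed space becomes reusable by MongoDB) but does not shrink the files on disk, whereas repairDatabase rewrites the files and can return space to the OS. A sketch of issuing both from pymongo, with a made-up collection:

    from pymongo import MongoClient

    db = MongoClient().test  # assumes a local mongod
    db.command('compact', 'sessions')  # in-place defrag; blocks its database
    db.command('repairDatabase')       # rewrites files; needs free disk headroom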