#mongodb logs for Friday the 18th of April, 2014

[00:07:52] <simpleAJ> hi.. I was looking at the db.createCollection API. It has a size field. Suppose my collection is not a capped collection and I didn't specify the size field.. how much space does MongoDB allocate for that collection?
[01:21:41] <Jadenn> does this look okay: db.friends.find( { $or : [ { $and : [ { friendUid : <id> }, { confirmed : 1 } ] } , { uid : <id> } ] } ,{uid:1,username:1,friendUid:1,friendUsername:1,confirmed:1,date:1} )
[01:24:57] <Jadenn> mongo does not like parenthesized queries :S
[01:30:59] <joannac> why do you need $and?
[01:31:29] <Jadenn> good question :P
[01:31:40] <Jadenn> i was just using an sql to mongo converter
[01:36:30] <Jadenn> ah, that fixed it, thanks
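(For reference, MongoDB implicitly ANDs conditions that sit in the same query document, so the converter's explicit $and can simply be dropped. A minimal sketch of the simplified query, keeping the collection and field names from above; <id> is still a placeholder:)
    db.friends.find(
        { $or: [ { friendUid: <id>, confirmed: 1 }, { uid: <id> } ] },
        { uid: 1, username: 1, friendUid: 1, friendUsername: 1, confirmed: 1, date: 1 }
    )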
[01:37:53] <Frozenlock> Is there a function like $near, but for timestamps?
[01:38:03] <cheeser> no
[01:38:13] <Frozenlock> :'-(
[01:54:43] <jadeLA222> Question: doing a nested lookup for the first time in RoR/Mongoid, and not sure what I’m doing wrong. Anyone spot something wrong w/ this syntax?
[01:54:49] <jadeLA222> Order.find( {'line_items.sku' : {'$in' : ["855906004184", "855906004245"]} } )
[02:05:57] <sx> would there be much performance difference for storing 1000 properties in one document vs 1000 documents w 1 property each
[02:08:43] <Frozenlock> Any suggestion on how to get $near functionality for timestamps? My crude approach would probably be using $lte and $gte, sort them and take the closest to my timestamp.
[02:12:34] <joannac> sx: if the 1000 properties are logically connected, what would be the point of separating them?
[02:13:18] <joannac> Frozenlock: sure I guess
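(A rough sketch of that $lte/$gte idea in the shell, assuming a collection named events with a ts field — both names invented here: query one document on each side of the target and keep whichever is closer.)
    var target = ISODate("2014-04-18T00:00:00Z");
    var before = db.events.find({ ts: { $lte: target } }).sort({ ts: -1 }).limit(1).toArray()[0];
    var after  = db.events.find({ ts: { $gte: target } }).sort({ ts: 1 }).limit(1).toArray()[0];
    // pick whichever of before/after is closest to target (either may be undefined at the edges)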
[03:15:11] <in_deep_thought> can someone explain the objectID to me? I have objects in my database that are autogenerated like this: db.images.insert({ "source" : "www.easyvn.net--45-best-real-world-full-hd-wallpapers--007.jpg", "_id" : ObjectId("534de25fce66b4cc1ab8fbe5"), "score" : 0, "__v" : 0 }) the ObjectID field throws everything off because I can't add it to other dbs since it's not valid json. then when I manually strip off the object id I get an error
[03:15:11] <in_deep_thought> when doing findbyID. what is the deal?
[03:17:06] <cheeser> the shell has an enhanced syntax that supports creating ObjectIDs, dates, etc. like that.
[03:17:15] <cheeser> and, no, it isn't valid json.
[03:17:33] <cheeser> you can convert it to a document like { $oid : "534de25fce66b4cc1ab8fbe5" } though
[03:20:29] <in_deep_thought> cheeser, so in the database itself, should the documents look like {"blah":5,"_id":ObjectID("jl3l4js234234jlk")} or should they look like {"blah":5,"_id":"jl3l4js234234jlk"}
[03:20:41] <cheeser> the first, typically
[03:20:50] <cheeser> the second is just a string not an ObjectID
[03:21:10] <in_deep_thought> so how am I supposed to call item._ID ? it's not valid JSON so it would throw an error, right?
[03:21:25] <cheeser> item._id is fine
[03:22:33] <in_deep_thought> and it will return ObjectID("jl3l4js234234jlk")? or "jl3l4js234234jlk"
[03:29:58] <in_deep_thought> cheeser, ?
[03:37:35] <in_deep_thought> why is it that when I type typeof ObjectId("507c7f79bcf86cd7994f6c0e").valueOf() into try.mongodb.org it tells me it's an object? I thought the whole point was that valueOf() returns a string
[03:38:13] <in_deep_thought> ObjectId("507c7f79bcf86cd7994f6c0e").valueOf() seems to return itself
[03:50:23] <cheeser> doh! just missed him.
[04:09:25] <in_deep_thought> cheeser, Sorry my IRC quit. In any case, would a query of item._id return ObjectID("jl3l4js234234jlk") or "jl3l4js234234jlk" ?
[04:12:40] <cheeser> depends on the document, of course.
[04:12:59] <cheeser> but assuming the document uses an ObjectID for its _id then you'd get an ObjectID
[04:13:32] <in_deep_thought> so with { "source" : "www.easyvn.net--45-best-real-world-full-hd-wallpapers--007.jpg", "_id" : ObjectId("534de25fce66b4cc1ab8fbe5"), "score" : 0, "__v" : 0 }, item._id would return ObjectId("534....")
[04:14:02] <in_deep_thought> and then I could do item._id.toValue to get the string itself?
[04:15:20] <cheeser> item._id.valueOf() will get you the string value
[04:15:25] <cheeser> or item._id.str
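(A quick shell illustration of the two accessors — a sketch; the exact output format can vary between shell versions:)
    var doc = { _id: ObjectId("534de25fce66b4cc1ab8fbe5"), score: 0 };
    doc._id             // ObjectId("534de25fce66b4cc1ab8fbe5")
    doc._id.str         // "534de25fce66b4cc1ab8fbe5"
    doc._id.valueOf()   // "534de25fce66b4cc1ab8fbe5"  (the hex string)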
[04:16:22] <in_deep_thought> I have read that one of the whole points of MongoDB is that it stores everything in JSON format so that it is easily accessible. Is this not true when it comes to the autogenerated _id values?
[04:16:56] <cheeser> no, it stores documents in BSON
[04:17:16] <cheeser> the shell has an extended syntax to make certain things nicer, e.g.
[04:18:39] <in_deep_thought> so everything is encoded in BSON, not just the _id field?
[04:19:02] <cheeser> correct
[04:55:13] <narutimateum> can someone help me..trying to install mongo in windows
[04:56:33] <joannac> you'll need to be more specific than that
[04:57:42] <narutimateum> i downloaded the installer and installed it, then what?
[05:01:27] <Alvein> hi all
[05:02:33] <Alvein> some question %) how to safely remove files with a dot from the dbpath dir? (dbname.7, dbname.8)
[05:03:03] <joannac> narutimateum: http://docs.mongodb.org/manual/tutorial/install-mongodb-on-windows/#run-mongodb
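(The gist of that page, sketched from the tutorial of the time — the paths below follow the manual zip-install layout and may well differ on your machine: create the default data directory, then start the server.)
    md \data\db
    C:\mongodb\bin\mongod.exe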
[05:03:17] <joannac> Alvein: Um, why?
[05:03:23] <joannac> Alvein: short answer: don't
[05:06:27] <Alvein> joannac: because, they are big :) very big))
[05:06:47] <Jadenn> >_>
[05:07:05] <Alvein> joannac: and their dates are more than half a year ago
[05:08:42] <joannac> Alvein: open a mongo shell, use <dbname>; db.stats()
[05:10:53] <Alvein> joannac: slowly :) waiting...
[05:14:36] <Alvein> joannac: https://www.dropbox.com/s/msclzekeg87wz5f/Screenshot%202014-04-18%2011.14.02.png
[05:14:57] <Alvein> joannac: and what do I do with this information?
[05:16:33] <joannac> wow!
[05:16:39] <joannac> that's a lot of fragmentation
[05:16:48] <joannac> can you afford some downtime?
[05:16:56] <joannac> if so, run repairDatabase
[05:17:21] <joannac> and you might want to turn on usePowerOf2Sizes as well
[05:18:15] <Alvein> how do you tell that my db has a lot of fragmentation?
[05:20:35] <joannac> because your fileSize is a lot bigger than your storageSize
[05:21:25] <Alvein> joannac, in this database I very often replace data.. maybe that's how the fragmentation started? Thanks for the explanation
[05:21:51] <Alvein> he-he https://www.dropbox.com/s/yzryy2tee2hztma/Screenshot%202014-04-18%2011.21.17.png
[05:22:39] <Jadenn> why on earth do you have so many collections
[05:24:09] <Alvein> and I wonder.. why? :) at some point I must have thought it was a good idea for our site structure
[05:32:40] <joannac> Alvein: shut down the node and then mongod --repair [--otherOptions...
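(The two suggestions in command form, as a sketch — "yourCollection" is a placeholder; a repair needs free disk roughly the size of the data set and takes the node offline while it runs:)
    // from the mongo shell, against the affected database:
    db.repairDatabase()
    // or: stop mongod and restart it once with --repair (plus your usual options)
    // and enable power-of-2 record allocation per collection:
    db.runCommand({ collMod: "yourCollection", usePowerOf2Sizes: true })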
[05:33:16] <Alvein> joannac: ok, thanks! Good day :)
[07:19:37] <zeejan> why doesn't MongoDB have a proper IDE
[07:19:43] <zeejan> It really needs one
[07:21:19] <sweb> i want to learn about mongo optimization... something like the MySQL performance ebook
[07:21:39] <sweb> about query optimization, indexing and so on ... for someone migrating from MySQL to MongoDB
[07:21:46] <sweb> any good guide for this?
[07:27:24] <ranman> sweb: http://docs.mongodb.org/manual/core/query-optimization/
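(The short version of that page, as a sketch — collection and field names invented here: build indexes that match your query shapes and confirm them with explain().)
    db.users.ensureIndex({ status: 1, created_at: -1 })   // matches the query below and its sort
    db.users.find({ status: "A" }).sort({ created_at: -1 }).explain()
    // in 2.4/2.6 output, look for a BtreeCursor (rather than BasicCursor) and nscanned close to n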
[07:57:52] <in_deep_thought> if I have a document that looks like this: {
[07:57:53] <in_deep_thought> "_id": {
[07:57:53] <in_deep_thought> "$oid": "534de25fce66b4cc1ab8fbf0"
[07:57:53] <in_deep_thought> },
[07:57:53] <in_deep_thought> "source": "www.easyvn.net--45-best-real-world-full-hd-wallpapers--018.jpg",
[07:57:53] <in_deep_thought> "score": 0,
[07:57:55] <in_deep_thought> "__v": 0}
[07:58:21] <in_deep_thought> how do I get the id field? is it item._id.$oid.valueOf()
[07:58:22] <in_deep_thought> ?
[07:58:24] <kali> in_deep_thought: please use a pastebin next time
[07:59:00] <kali> i don't understand your question.
[07:59:01] <in_deep_thought> kali, sorry I accidentally copied the paste text instead of the url
[08:00:21] <in_deep_thought> http://bpaste.net/show/N0XZSHsBq3j4WOTCZ1Ke/ this is my object. I want to return that long string. what is the way to do this? it looks like to me it might be Item._id.$oid but I am not sure if that is right? and then would I have to do .valueOf() on the result of that to get a string?
[08:02:07] <kali> in_deep_thought: what language are you using ? { $oid: } is just a json conversion trick, the _id is actually an ObjectId, so you should look for the interface of the ObjectId class in the language you're using
[08:03:36] <in_deep_thought> I am using javascript. It says that the convention is ObjectID.str to get the string out
[08:03:54] <kali> item._id.str
[08:04:25] <in_deep_thought> ok, so the whole $oid thing is just a convention?
[08:05:07] <kali> yes. it's a way of encoding a bson document in json. json has fewer types than bson, so...
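(In other words, the same document has two faces, sketched here: the shell shows the BSON types, while plain-JSON exports such as mongoexport fall back to the Extended JSON convention.)
    // in the mongo shell (BSON types):
    { "_id" : ObjectId("534de25fce66b4cc1ab8fbf0"), "score" : 0 }
    // as plain JSON (Extended JSON):
    { "_id" : { "$oid" : "534de25fce66b4cc1ab8fbf0" }, "score" : 0 }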
[08:07:39] <in_deep_thought> do I have to call .str on the other objects I want to get? like if I want to get the url after source in that paste, would I need item.source.str?
[08:08:43] <kali> i don't think so, but my knowledge of js has its limits
[08:17:49] <bushart> Hi everyone. I have 60 million documents. And i try to group them by an indexed field.... But it takes a veeeery long time...
[08:18:04] <bushart> How can I speed up this process?
[08:19:41] <bushart> Does it even use the index for grouping?
[08:20:02] <kali> how do you group them ?
[08:20:16] <kali> with the aggregation framework ?
[08:20:19] <bushart> yes
[08:20:41] <kali> an index should help, yeah
[08:21:03] <kali> since 2.6, the aggregation pipeline has an explain()
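(For reference, the 2.6 syntax for that looks roughly like this — collection and fields invented here:)
    db.events.aggregate(
        [ { $match: { type: "click" } }, { $group: { _id: "$domain", total: { $sum: 1 } } } ],
        { explain: true }
    )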
[08:21:38] <bushart> Great idea. Now I will try.
[08:39:18] <bushart> crrrime solved: MongoDB cannot merge multiple indexes.
[08:43:01] <kali> bushart: in some cases, 2.6 should do it
[08:49:21] <rbnb> Do secondaries in a replica set maintain an oplog just like the primary?
[08:51:56] <bushart> kail: but if i use $match and $group? does it use the index?
[08:53:45] <rbnb> Ah, found it. Yes, secondaries do maintain an oplog: http://docs.mongodb.org/manual/core/replica-set-oplog/
[08:54:36] <bushart> kail: Or are they treated as separate stages?
[08:56:49] <Nodex> derp
[08:57:26] <bushart> Nodex: ?
[08:58:41] <rbnb> Are writes to the oplog journaled before being made visible to secondaries?
[09:01:35] <kali> bushart: it's "kali", not "kail" :)
[09:02:30] <kali> bushart: i don't know, i haven't spent enough time on 2.6 to get a good feeling for what the optimizer knows how to do yet. but explain() should help
[09:03:28] <kali> bushart: then depending on what you're doing in the match and group, a composite index might be the solution, but you'll have to show me what you're doing
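(A hedged sketch of what a composite index buys here — names invented: the leading $match can use the index and shrink the input, while the $group stage itself just consumes whatever the earlier stages feed it.)
    db.events.ensureIndex({ type: 1, domain: 1 })
    db.events.aggregate([
        { $match: { type: "click" } },
        { $group: { _id: "$domain", total: { $sum: 1 } } }
    ])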
[09:06:02] <Nodex> pred
[09:56:12] <Guest53043> while importing json document in mongodb i got this error
[09:56:13] <Guest53043> exception:BSON representation of supplied JSON array is too large: code FailedToParse: FailedToParse: Date expecting integer milliseconds:
[09:56:32] <Guest53043> can anyone help me out with a solution
[10:47:23] <tinix> hi all
[10:48:06] <tinix> i have a question about unicode querying.... is there some way to coerce strings to a simpler character set when querying?
[10:48:39] <tinix> say a field is "Melé", could i query it by using "Mele"?
[10:48:59] <tinix> or would I need to convert it and store it that way on insertion?
[10:50:34] <kali> tinix: you need to store the "normalized" form at insertion
[10:51:26] <kali> tinix: or maybe use a text index
[10:51:43] <Derick> text indexes don't support this yet
[10:53:17] <kali> Derick: ha ok.
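(A sketch of kali's first suggestion in JavaScript — field names invented, and normalize() needs a reasonably modern JS runtime or a transliteration library: store a folded copy of the string at insert time and query against that.)
    function fold(s) {
        // decompose accented characters and strip the combining marks
        return s.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
    }
    db.people.insert({ name: "Melé", name_folded: fold("Melé") });
    db.people.find({ name_folded: fold("Mele") });   // matches the "Melé" document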
[10:55:27] <tinix> hmm... i'll probably just dump that data into another DB system then, and store the object IDs
[10:55:40] <tinix> that's a bummer
[10:55:55] <Derick> yes :-/ We're looking at collation for future versions though
[10:55:58] <tinix> good excuse to setup lucene though
[10:56:17] <Derick> tinix: May I suggest either elastic search or solr though?
[10:56:22] <tinix> sure thing
[10:56:34] <tinix> that's what i meant anyway (it's late haha)
[10:56:43] <Derick> I can't guess that :)
[10:56:54] <tinix> kiiinda the same thing, almost. :P
[11:05:57] <tinix> anyone know if redis does that type of coercion? i'm already using it for something else... if i can go w/o having to setup solr that'd be nice.
[11:07:05] <Derick> i doubt it
[11:07:06] <tinix> i have postgres running too, so that's another option.
[11:07:17] <Derick> postgres will support this though
[11:07:42] <tinix> yeah
[11:08:19] <tinix> i guess i'll use an update hook or something to keep the RDB up-to-date
[11:08:56] <Nodex> redis treats everything as binary afaik
[11:09:17] <rbnb> Are writes to the oplog journaled before being made visible to secondaries?
[11:22:21] <rbnb> I guess what I want to know is: does this image (reasonably) accurately portray MongoDB replication? http://daprlabs.com/blog/wp-content/uploads/2014/04/mongodb_repl.png << only one secondary is shown in that picture for simplicity
[11:36:02] <kali> rbnb: the "ok" is not necessarily sent that late, it depends on the writeconcern
[11:36:28] <Derick> rbnb: not anymore, writes have changed in 2.6
[11:37:06] <Derick> (no more getLastError)
[11:37:12] <Derick> the rest looks fine, and what kali says
[11:42:26] <rbnb> Thanks, guys. Assuming that I want a writeconcern which gives a high consistency guarantee, when would it send? Would it be after journaling or application? I would assume after journaling is sufficient
[11:42:43] <rbnb> I'll read up on the 2.6 changes
[11:45:44] <rbnb> Derick, can I confirm that the only change in the flow for 2.6 is that getLastError is effectively implied by the write operation itself?
[11:46:42] <rbnb> i.e, insert() then getLastError() in 2.4 is equivalent to insert() in 2.6
[11:47:30] <kali> rbnb: there is no general answer to that. it's a matter of infrastructure reliability and risk assessment. ack with w=majority has some benefits for instance.
[11:48:24] <rbnb> That's fair enough, kali. We will assume I always pass w=majority
[11:52:24] <Derick> rbnb: yes, sortof
[11:53:11] <rbnb> That seems to be the case from the release notes
[11:54:05] <kali> rbnb: my point is, with w=majority, you *don't* wait for journaling
[11:54:29] <rbnb> kali, you apply the write to the data before journaling it?
[11:56:17] <kali> rbnb: (fyi, i'm not a mongodb developer) my understanding is in "ack" mode, the write goes to the journal, but you don't wait for the journal to be committed to disk
[11:56:47] <kali> rbnb: and if i understand correctly, that's what http://docs.mongodb.org/manual/core/write-concern/ says
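(For reference, the 2.6 shell lets you set the write concern per operation; a minimal sketch with invented collection and values — w:"majority" waits for acknowledgement from a majority of the replica set, and j:true additionally waits for the journal commit on the primary.)
    db.orders.insert(
        { item: "abc", qty: 1 },
        { writeConcern: { w: "majority", j: true, wtimeout: 5000 } }
    )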
[11:57:38] <rbnb> w=majority -> "enough systems are aware of this write and therefore you may as well consider it successful, even though it hasn't actually been made durable"
[11:58:29] <rbnb> why only "majority ack" and no "majority journaled" write concern mode?
[11:58:44] <rbnb> Also... the diagram scares me when it shows "apply" before "journal"
[11:58:54] <rbnb> Imagine if disk filesystems did that....
[11:59:03] <kali> because it's faster, and the risk of having more than one replica failing is low
[12:00:32] <rbnb> This is how the GFC happened... a bunch of random events were modeled independently, but there was little thought put to the correlation between them... like if the DC has a power shortage just after acking :P
[12:01:02] <rbnb> strong tongue-in-cheek, but there's some truth to it... the option to have a more durable write concern seems useful
[12:01:15] <kali> rbnb: as i said, it's a risk vs performance compromise
[12:02:07] <rbnb> Thanks, kali :)
[12:02:11] <kali> rbnb: some apps can tolerate higher latency, others can tolerate some occasional miswrites
[12:02:48] <kali> rbnb: the good thing with mongodb is, you don't even have to choose at the application scope. you can pick request per request your durability policy
[12:02:58] <kali> rbnb: and that is f***ing great :)
[12:03:27] <rbnb> That's true, but for the apps which can tolerate higher latency, there is no "bet the business" option
[12:04:11] <rbnb> For the occasional "this write has to happen or we're f****d" operation, there doesn't seem to be a "truly durable" write concern level
[12:04:22] <rbnb> w:majorityJournaled would be good
[12:05:04] <rbnb> also, this image scares the shit out of me: http://docs.mongodb.org/manual/_images/crud-write-concern-journal.png
[12:05:12] <rbnb> apply before write to journal
[12:05:37] <rbnb> Surely that's a misleading image
[12:05:59] <kali> rbnb: i think in that case, the write to journal should be understood as "commit journal to disk"
[12:06:54] <rbnb> as long as journal commits always occur before data flushes
[12:09:53] <kali> rbnb: seriously, mongodb does work. we - long time users and supporters - and the developers are not that stupid :)
[12:10:13] <rbnb> I've used it a fair bit
[12:10:32] <rbnb> It's good
[12:10:47] <rbnb> but I had many issues at high write throughput in a replicated cluster
[12:11:20] <rbnb> Basically they amounted to the "effectively global" lock on the localdb
[12:12:25] <rbnb> and the replication mechanism itself, all writes going through the oplog, having secondaries tail that log
[12:12:35] <rbnb> then write that to their own copy of the oplog
[12:12:50] <kali> maybe it's time to shard ?
[12:12:51] <rbnb> having that itself written to the journal
[12:13:06] <h0rnet> prolly lacks ssds
[12:13:12] <rbnb> That's right
[12:13:13] <rbnb> no SSDs
[12:13:19] <rbnb> cloud environment
[12:13:28] <rbnb> Sharded to all buggery
[12:14:38] <rbnb> The whole system would be a heck of a lot faster & also more reliable if it was based on a distributed consensus algorithm (paxos, raft) rather than log shipping
[12:14:49] <rbnb> no need to put oplog on disk
[12:14:53] <h0rnet> isnt it the same problem with other db's
[12:15:30] <h0rnet> mysql writes bin logs etc
[12:15:43] <rbnb> MySQL is not designed for distribution
[12:16:05] <rbnb> Some DBs use a distributed journal
[12:17:09] <h0rnet> if the oplog is mem only, hows it going to know where replication left off when you bounce the box
[12:17:34] <kali> so. you don't want to shard, you don't want to go ssd, you don't want to relax the write concern
[12:17:38] <rbnb> Use the journal instead, or replace the journal with the oplog
[12:17:40] <Derick> https://github.com/dwight/mongo/commit/6cc0ea630cd8ad44c45e0a6f91650289346b6c2f
[12:17:44] <rbnb> It does shard
[12:18:15] <rbnb> I think there were 18 shards * 3 replicas
[12:19:36] <kali> Derick: is that 2.6 ?
[12:19:45] <h0rnet> sounds like you are missing some performance settings somewhere to me
[12:19:48] <Derick> kali: no, that's experimental
[12:19:54] <kali> Derick: ok
[12:20:02] <Derick> beginning of page level locking
[12:20:18] <rbnb> page level locking will help a lot
[12:21:08] <rbnb> but it still seems that every write to a primary causes 4 things to be written on that primary rather than 2 (journal oplog, write oplog data, journal target, write target data)
[12:21:35] <Derick> those writes are into *memory* though
[12:21:58] <rbnb> Derick, which ones of those are in mem? Don't they all get persisted?
[12:22:43] <Derick> sure, they're backed by disk - they're memory mapped files
[12:22:55] <Derick> but the writes you speak of don't necessarily hit disk
[12:27:29] <rbnb> But they do all need to hit the disk for data to be durable (and they will all be in separate files, iirc, so four write ops) - or am I mistaken there?
[13:09:52] <sweb> does mongodb by default accept all connections from outside the machine ?
[13:47:56] <bushart> Hi everyone.
[13:47:57] <bushart> Does grouping use indexes?
[13:51:02] <bushart> I tried to create an index for the grouping, but it still takes very long. And explain shows only a BasicCursor :(
[13:54:46] <Nodex> grouping in the aggregation framework does. Not sure about normal grouping
[13:58:14] <bushart> Nodex: How to create an index for this?
[13:58:47] <bushart> Nodex: I have tried different options, but failed.
[14:01:55] <boutell> anyone know how to get the version number of the server via the client? I’m use the node client, if it matters.
[14:14:34] <Kaim> hi all
[14:14:48] <Kaim> I'm doing some test with update+upsert=true
[14:15:06] <Kaim> but I can't go up to 250update/sec because of db lock...
[14:15:13] <Kaim> the lock is always at 150%...
[14:15:19] <Kaim> how can I improve that ?
[14:19:27] <Lujeni> Kaim, single server ?
[14:20:05] <bushart> I'm trying to reproduce this example: http://docs.mongodb.org/manual/core/aggregation-pipeline/#pipeline-operators-and-indexes but explain doesn't show that the index is used for the $group stage. Does it mean that the index isn't used, or maybe explain just doesn't show it to me?
[14:21:13] <Kaim> Lujeni, yes
[14:22:00] <Lujeni> Kaim, iostat ?
[14:22:11] <Derick> bushart: group can't use an index, it makes no sense as the whole collection (or pipeline) is being read anyway
[14:22:30] <Lujeni> Kaim, index ?
[14:23:08] <Nodex> Kaim, what are your server specs?
[14:25:47] <Jadenn> how do i specify data type when updating a document
[14:26:07] <Jadenn> i updated a field using a mongo client and it saved as a double
[14:26:32] <Kaim> Lujeni, iostat are ok, 7% util
[14:26:38] <Kaim> I don't have index
[14:28:46] <Kaim> Nodex, VM with 4cores and 8GB RAM
[14:31:10] <Lujeni> Kaim, have u an example of your update query ?
[14:33:47] <Kaim> @db.collection(event.sprintf(@collection)).update(
[14:33:47] <Kaim> {
[14:33:47] <Kaim> "timestamp" => document["timestamp"],
[14:33:47] <Kaim> "domain" => document["domain"],
[14:33:47] <Kaim> "type" => document["type"],
[14:33:48] <Kaim> },
[14:33:50] <Kaim> {
[14:33:52] <Kaim> "$set" => {
[14:33:56] <Kaim> "timestamp" => document["timestamp"],
[14:33:58] <Kaim> "domain" => document["domain"],
[14:34:00] <Kaim> "type" => document["type"],
[14:34:02] <Kaim> },
[14:34:04] <Kaim> "$inc" => {
[14:34:06] <Kaim> "value" => document["value"],
[14:34:08] <Kaim> },
[14:34:10] <Kaim> },
[14:34:12] <Kaim> {
[14:34:14] <Kaim> :upsert => true,
[14:34:16] <Kaim> },
[14:34:17] <Kaim> )
[14:34:19] <Kaim> Lujeni, arf sorry for large paste
[14:34:55] <Lujeni> Kaim, ur query always use the combo ts/domain/type ?
[14:35:10] <Kaim> yes, I can make an index on it
[14:35:21] <Lujeni> Kaim, try :)
[14:35:36] <Lujeni> and check with an explain
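(The index Lujeni is pointing at, sketched for the fields in the query above — the collection name is a placeholder for whatever event.sprintf(@collection) resolves to:)
    db.getCollection("yourCollection").ensureIndex({ timestamp: 1, domain: 1, type: 1 })
    // then check that the update's query part uses it, e.g.
    // db.getCollection("yourCollection").find({ timestamp: t, domain: d, type: ty }).explain()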
[14:36:55] <bushart> Derick: Thank you.
[14:39:33] <Kaim> Lujeni, with the index it's better, I'm up to 2800 updates/sec with 30% lock
[14:39:34] <Kaim> :)
[14:39:48] <Kaim> what else can I improve ?
[14:40:21] <Lujeni> check ur io stat now :)
[14:41:20] <Lujeni> and u have enough memory for your working set i guess
[14:44:01] <Jadenn> joannac: i didn't realize it yesterday, but yes the $and was supposed to be there, i was trying to replicate WHERE (friendUid = <id> AND confirmed = 1) OR uid = <id>
[14:44:21] <Jadenn> or maybe it wasn't, but yeah :P
[14:45:38] <megido> hi
[14:46:03] <Jadenn> no, that isn't correct, it's not getting any results for the (query AND query)
[14:46:04] <megido> can i compress the data/db dir and use it on another machine?
[14:46:22] <megido> if i uncompress it and set the dbpath to this dir
[14:47:02] <Jadenn> yes megido, mongod should be able to boot from it, providing all the files are intact
[14:47:25] <megido> Jadenn: thx
[14:47:52] <megido> Jadenn: can't mongo export a compressed dump?
[14:48:19] <Jadenn> megido: http://docs.mongodb.org/v2.2/reference/mongodump/
[14:48:30] <Nodex> Jadenn : Mongodb uses an implicit and with queries
[14:48:48] <Jadenn> megido: i don't believe it is compressed however
[14:49:17] <Jadenn> Nodex: how would i go about the aforementioned query?
[14:49:18] <Nodex> db.foo.find({$or:[{friendUd:1234, confirmed:1},{uid:1234567}]});
[14:49:36] <megido> that's what I meant - to just archive the folder
[14:49:40] <Nodex> either of those will satisfy the query
[14:49:48] <Jadenn> *covets that answer as the holy grail*
[14:49:50] <Jadenn> thanks!
[14:50:03] <Jadenn> that query has been bugging me since january
[14:50:06] <Nodex> http://docs.mongodb.org/manual/reference/operator/query/or/
[14:51:22] <Nodex> you will want a compound index on friendUid, confirmed
[14:51:25] <Nodex> and a normal index on uid
[14:52:23] <Jadenn> how much of a performance burden would i be looking at if i just used normal indexes
[14:52:42] <Jadenn> there are thousands of queries, i don't think i'm quite as ambitious as to compound all of the fields
[14:52:53] <Nodex> you don't need to compound them all
[14:52:56] <Nodex> just two of them
[14:53:13] <Nodex> db.foo.ensureIndex({friendUid:1,confirmed:1});
[14:53:13] <Jadenn> i know, but compounding the relational fields that each query needs
[14:53:30] <Nodex> right, well in 2.6 indexes can intersect
[14:53:49] <Nodex> I am yet to test their efficiency in the query planenr so I don't know how well they perform
[14:54:04] <Nodex> planner*
[14:54:39] <Jadenn> many thanks :>
[14:55:08] <Nodex> :D
[14:55:40] <Jadenn> erp spoke too soon lol
[14:56:41] <Nodex> if you're having to write complex queries with complex indexes then you might want to think about changing your data structure
[14:56:52] <Jadenn> it will only retrieve results matching { uid: } http://puu.sh/8dEom.png
[14:57:43] <Nodex> you want ALL of the results i.e. any in the brackets plus the ones outside?
[14:57:44] <Jadenn> well i had changed the data structure before, but i was running into even more problems
[14:57:57] <Jadenn> i need to replicate WHERE (friendUid = <id> AND confirmed = 1) OR uid = <id>
[14:58:12] <Nodex> that's wht the $or does
[14:58:16] <Nodex> what
[14:58:17] <Nodex> *
[14:58:40] <Jadenn> yes, but it won't match {friendUd:1234, confirmed:1}
[14:58:51] <anildigital> guys.. I want to map a tree kind of data structure in mongo... basically for multi level marketing scenario
[14:59:08] <Nodex> can you pastebin your query
[14:59:09] <anildigital> how mongo helps for such case
[14:59:31] <Nodex> http://docs.mongodb.org/manual/applications/data-models-tree-structures/ <---
[14:59:34] <Nodex> Google #1 result
[14:59:47] <Kaim> Lujeni, thx :)
[14:59:58] <Jadenn> http://paste.arviksa.co.uk/wabarupiboyixaho
[15:00:14] <anildigital> Nodex: yep.. I am reading that link.. but is mongo suitable for such kind of data?
[15:00:28] <anildigital> basically a parent with many nested nodes at n level
[15:00:29] <Nodex> friendUd <-- typo ?
[15:00:33] <Nodex> Uid ....
[15:00:55] <anildigital> I need to show the count of all the n nodes including nested .. for a parent.
[15:01:00] <Jadenn> ah yes it was a typo, but fixing it still returns only { uid: }
[15:04:18] <Jadenn> i have verified the tables do exist, just can't get it to select AND inside of the OR query
[15:04:23] <Jadenn> documents rather
[15:07:06] <Nodex> can you flip the $or queries
[15:07:12] <Nodex> the planner might be getting confised
[15:07:15] <Nodex> confused*
[15:07:56] <Jadenn> {$or:[{uid:ObjectId("532dabede1ac8a14108b4568")},{friendUid:ObjectId("532dabede1ac8a14108b4568"), confirmed:1}]}
[15:07:57] <Jadenn> no change
[15:09:25] <Nodex> what does this give you... db.foo.count({uid:ObjectId("532dabede1ac8a14108b4568")});
[15:09:47] <Nodex> and then what does this give you db.foo.count({friendUid:ObjectId("532dabede1ac8a14108b4568"), confirmed:1});
[15:10:40] <Jadenn> i'm cursing on my side of the computer :<
[15:10:47] <Nodex> haha
[15:10:50] <Jadenn> the confirmed data was stored as a string
[15:10:53] <Jadenn> not an int
[15:10:57] <Nodex> :/
[15:11:00] <Nodex> that would do it!
[15:11:03] <Jadenn> YUP
[15:11:15] <Nodex> you'd be surprised how many people forget to cast things
[15:11:26] <Nodex> it's probably the number one mistake made in mongo
[15:11:54] <Jadenn> well when i imported the data from mysql i casted everything *but* the confirmed field
[15:12:27] <Nodex> LOL
[15:12:37] <Jadenn> luckily mongo takes mere seconds to change data so ill just go in and foreach everything int
[15:14:22] <Joeskyyy> Just playin' around with stuff here to preface, I know elasticsearch does more what I'm looking for, BUT. Anyone know if you can do a partial text search using the new fancy $text op?
[15:14:37] <Joeskyyy> i.e. "coffe" and "coffee" return similar documents
[15:14:49] <Nodex> I think $text has stemming in it
[15:14:59] <Nodex> never played with it myself
[15:15:15] <Joeskyyy> It's pretty agile I gotta say.
[15:15:31] <Joeskyyy> Only thing it's lacking, at least going by the docs, is partial searching.
[15:16:25] <Joeskyyy> Nodex: Not enough coffee (heh…) yet, stemming?
[15:17:01] <Jadenn> now the only issue is to get the php driver to show the same results <_>
[15:20:50] <Jadenn> at this point i can't really tell what or what isn't supported with the php driver
[15:21:06] <Jadenn> even if there are any unsupported operators :S
[15:21:20] <Joeskyyy> Stemming, for deriving the stem of a word. Well if that wasn't obvious.
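(A minimal sketch of the 2.6 text-search pieces under discussion — collection and field names invented:)
    db.posts.ensureIndex({ body: "text" })
    db.posts.find({ $text: { $search: "coffee" } })
    // stemming folds inflected forms of a word together, but $text has no prefix/partial matching,
    // so a fragment like "coffe" is not guaranteed to match documents containing "coffee"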
[15:21:40] <Jadenn> this should be the equivalent to that query [ '$or' => [ [ 'friendUid' => new mongoId($id) ], [ 'uid' => new mongoId($id) ] ] ]
[15:26:35] <boutell> is there really no way to detect the mongodb server version?
[15:26:39] <boutell> from the client, that is.
[15:26:45] <boutell> really need to know if $text is a thing
[15:27:17] <Jadenn> boutell: http://docs.mongodb.org/manual/reference/server-status/
[15:27:48] <jet> Is there a way to not create a database automatically? I would like to be able to create a database only with the create command
[15:28:11] <boutell> Jadenn: thanks, that is helpful. Need to figure out how to invoke that command from the node client now.
[15:28:43] <boutell> looks like db.command
[15:30:38] <Jadenn> jet: how do you mean automatically? as far as i know, it can only be created by command
[15:31:09] <jet> if you use a database that doesn't exist, it gets magically created, no ?
[15:34:14] <Jadenn> the data online about that is a bit sparse, but i'm sure someone else here knows
[15:36:45] <Nodex> Joeskyyy : stemming reduces words to their base form
[15:37:31] <Nodex> Jadenn : you need another part to that in your query
[15:37:52] <Jadenn> like what? o.o
[15:38:13] <Jadenn> i'm just running a basic find()
[15:38:31] <Nodex> $query=array('$or'=>array(array('friendUid'=>new MongoId($id),'confirmed'=>(int)1),array('uid'=>new MongoId($id))))
[15:39:30] <Jadenn> php is dynamically typed, int is inferred without quotes
[15:39:39] <Nodex> it's not
[15:39:48] <Nodex> 1 == "1" without casting
[15:40:04] <Jadenn> hasn't broken any of my other queries o.o
[15:40:17] <Jadenn> in fact i never cast
[15:40:45] <Jadenn> either way i excluded confirmed: 1 on purpose
[15:41:08] <Jadenn> that query should return all the results plus the ones that aren't confirmed
[15:41:20] <Jadenn> oh
[15:41:25] <Jadenn> i know whats wrong ._.
[15:42:09] <Jadenn> before i just glomped everyones friends into their own user object, and dealt with sorting and paging in php
[15:42:17] <Jadenn> i forgot to change the collection :<
[15:42:25] <Jadenn> TODAY IS NOT A GOOD DAY
[15:43:05] <dragoonis> I need to emulate this MySQL clause for a mongodb collection.. WHERE answer_id IN(2845, 2846)
[15:43:12] <dragoonis> can someone assist?
[15:43:40] <Jadenn> dragoonis: http://docs.mongodb.org/manual/reference/operator/query/in/
[15:44:23] <dragoonis> Jadenn, this is applying OR logic, from what I can see
[15:44:33] <dragoonis> I need rows that have both 2845 and 2846
[15:44:35] <Nodex> $in is the same as IN
[15:44:57] <Nodex> you're looking for $all
[15:46:02] <dragoonis> Nodex, okay, checking
[15:46:11] <Jadenn> alright. NOW we are using the right collection and everything is working
[15:46:15] <Jadenn> thanks Nodex :D
[15:48:09] <dragoonis> Nodex, take a look at this: https://gist.github.com/dragoonis/071680922eac5ac4f724
[15:48:21] <Nodex> something has changed either in the driver or the PHP core in the last year because typing was always required
[15:48:41] <dragoonis> response_id: 40516 .. has entries for 2845 and 2844, but the $all doesn't find this entry
[15:49:13] <Nodex> then you want $in
[15:49:20] <Nodex> you said you wanted all of them
[15:49:36] <Nodex> [16:44:16] <dragoonis> I need rows that have both 2845 and 2846
[15:50:07] <Nodex> none of your rows have both answer_id's because they're not an array
[15:50:17] <dragoonis> Nodex, I believe I need an array then.
[15:50:56] <dragoonis> Nodex, so response_id, should have an array of answer_ids ?
[15:51:20] <Nodex> i'm very confused as to what you want now
[15:51:47] <Nodex> db.answer_to_response_map.find({answer_id: { $in: ["2845", "2846"] } }).count(); <--- that will get you "WHERE answer_id = 2845 OR answer_id=2846"
[15:51:59] <Nodex> i.e. the same as *sql IN()
[15:53:20] <Nodex> if your answers and responses are always integers then you should cast them as int's - you will save some space and sorting will work better
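(The distinction in shell form — the collection name comes from the gist above, an array field named answer_ids is an assumption for the $all case, and the values are cast to integers per the suggestion:)
    // "answer_id is 2845 OR 2846" -- the SQL IN() equivalent:
    db.answer_to_response_map.find({ answer_id: { $in: [2845, 2846] } })
    // $all only makes sense against an array field that must contain every listed value:
    db.answer_to_response_map.find({ answer_ids: { $all: [2845, 2846] } })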
[17:42:19] <boutell> in thanks for an answer earlier, here is node code to verify the server is at least mongo 2.6:
[17:42:50] <boutell> https://gist.github.com/boutell/11055808
[17:43:05] <boutell> whoops, there’s a bug there (:
[17:46:05] <boutell> corrected version:
[17:46:06] <boutell> https://gist.github.com/boutell/11055808
[19:45:47] <Dimrok> Hi!
[19:46:31] <saml> hi Dimrok
[19:48:41] <Dimrok> I'm having trouble with a request, so I would like to know if you'd mind giving me a hand? I want to atomically find or modify a document.
[19:49:22] <Dimrok> (python driver by the way)
[19:49:54] <Dimrok> My case is: I want to get a field from a document (a User that I get from users.find_one({'email': email})). The problem is, it's a new field that I want to add if it's not present in the document.
[19:52:25] <saml> Dimrok, users.update
[19:57:40] <Dimrok> @saml yeah but how do I distinguish users that already have the field (to use this one) ? Is your answer to do something like: users.update({"email": email, field: Null}, {"$set": {field: The_New_Value}}) every time?
[19:58:32] <saml> {'email': email, 'newfield': {'$exists': False}}, {'$set': {'newfield': value}} ?
[19:58:40] <saml> isn't email unique?
[20:03:27] <Dimrok> I'm scared of: -a race condition / -a CPU-hungry function. Let me explain: I want to compute a special hash per user, based on their email and some other data, if they don't have one. The problem is that with the python driver I will compute the hash before knowing whether the {"$exists": False} is positively evaluated by mongo.
[20:04:27] <Dimrok> So I'll probably have to use a non-atomic request, but that's ok. :)
[20:05:56] <saml> Dimrok, not clear what you're trying to do
[20:07:35] <tscanausa> Dimrok, why dont you start off with what problem are you trying to solve and then we might be able to help?
[20:09:18] <Dimrok> Sorry: > Imagine this database: [ {"email": "foo@bar.io"}, {"email": "bar@foo.io", "hash": "4242"} ]. I want a function that returns the hash field of the user (computing it if not present). So: method("bar@foo.io") will return "4242" directly, and method("foo@bar.io") will compute the hash, store it in the document and return it.
[20:11:49] <Dimrok> So what I really want is an atomic function that find_OR_modify my document (and lazily evaluate the update argument).
[20:12:42] <kali> Dimrok: well, you don't have much choice but to write exactly what you describe. your computation has to be application side, as mongodb has no provisions for doing that kind of stuff, so you'll need two requests anyway
[20:13:44] <kali> Dimrok: as long as your hash is a function of other fields of the document, your compute-and-store branch will be idempotent, so the worst that can happen is having two clients compute the value at the same time, and getting the same result
[20:15:13] <Dimrok> In fact, as long as my parameters are the same, the hash function should be idempotent so it should work.
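(A sketch of saml's pattern, written in shell syntax for brevity — Dimrok is on the Python driver, where the shape is the same; computeHash is a hypothetical application-side function: only $set the hash where the field is missing, then read back whichever value ended up stored.)
    var hash = computeHash(email /*, other data */);
    db.users.update(
        { email: email, hash: { $exists: false } },
        { $set: { hash: hash } }
    );
    var user = db.users.findOne({ email: email }, { hash: 1 });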
[20:29:48] <aaronds> Hi I'm trying to get the id of a newly inserted object using the Java API. It looks like I should be able to use getUpsertedId(); on the returned WriteResult object but it seems to always return null. Running server 2.6 so this should work. Anyone got any ideas?
[20:32:14] <kali> aaronds: the usual approach is to set the _id yourself. just call new ObjectId(), that's exactly what the client does anyway
[20:35:40] <aaronds> Ah ok, use that on the insert so I've already got a reference to it kali?
[20:35:50] <kali> yes
[20:35:56] <aaronds> cheers
[20:36:42] <aaronds> kali: just out of interest how does creating a new ObjectId ensure it's unique?
[20:36:55] <kali> aaronds: sheer luck
[20:37:13] <aaronds> lol
[20:37:30] <kali> aaronds: seriously: http://stackoverflow.com/questions/4677237/possibility-of-duplicate-mongo-objectids-being-generated-in-two-different-colle
[20:38:31] <aaronds> interesting cheers
[21:30:30] <saml> I have two databases: articles and images (not in the same database). can I do a query like: find all articles whose image is larger than certain size? article example: {img: 'foo/bar.jpg', _id: 'a/b.html'} image example: {_id: 'foo/bar.jpg', width:100, height:200}
[21:31:29] <saml> currently, i do articles.articles.find({img:{$exists:1}}) and for each doc, query images.images.find({width:{$gt:100}})
[21:37:40] <tscanausa> do your images have meta data about the article?
[21:44:05] <saml> tscanausa, what do you mean? images have image metadata. articles read from it too
[21:44:14] <saml> to render image credit, alt text... etc
[21:44:29] <saml> i think i should duplicate those into the article
[21:44:45] <saml> but different team is managing images. image metadata.. etc
[21:45:16] <saml> so when they update image metadata, need to find all articles that use the image and update metadata on the article, too
[21:59:23] <tscanausa> so if your image collection has a field called article then it is really easy to do; otherwise you need to loop through all of your articles, collect your image ids, and then go fetch the info from the images collection
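(A sketch of that loop approach, flipped to start from the images side — database and collection names taken from saml's description: an "images" database with an images collection and an "articles" database with an articles collection:)
    // 1. collect the ids of images wider than 100px
    var imgIds = db.getSiblingDB("images").images
                   .find({ width: { $gt: 100 } }, { _id: 1 })
                   .map(function (d) { return d._id; });
    // 2. find articles that reference any of those images
    db.getSiblingDB("articles").articles.find({ img: { $in: imgIds } });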
[23:23:52] <LetterRip> I'm doing a $project and the field isn't appearing, here is my code and an example data and my output
[23:23:53] <LetterRip> http://www.pasteall.org/50935/python
[23:25:04] <LetterRip> it is an aggregator query