pmxbot IRC Log Viewer

[00:36:20] <metasansana> just finished mongo for dba week 2 :)

[00:47:51] <Derick> yay! way to go metasansana

[00:49:12] <metasansana> thanks :)

[00:51:43] <Derick> i should probably follow that course too :-)

[00:52:43] <BurtyB> metasansana, is it language agnostic?

[01:01:53] <Derick> BurtyB: afai, it's all javascript shelll

[01:03:32] <BurtyB> Derick, ah - not as bad as the "for developer" ones then (well unless one of the languages they offer is the one you use)

[01:05:23] <Derick> out of curiosity... which language are you missing?

[01:08:33] <Derick> heh

[01:08:36] <Derick> it's ok :)

[01:08:49] <Derick> BurtyB: it's also a good introduction to python really

[01:11:57] <BurtyB> Derick, I did signup when they first announced them but had no clue about python and making it work with apache so gave in when they expected it to be working

[01:12:24] <metasansana> BurtyB, mostly shell so far yes but I believing there are parts related to deployment like clusters and sharding etc.

[01:13:44] <Derick> BurtyB: I didn't think apache has anything to do with it?

[01:15:41] <metasansana> I did the one for nodejs developers as well it was mostly easy.

[01:16:16] <metasansana> In my opinion it barely touched on nodejs specific stuff.

[01:24:13] <BurtyB> Derick, iirc it was actually bottle

[04:07:24] <bin> okay guys here is the question. I have an array of objects. How to specify dynamically (based on the value of some property) to which element of the array to update its property

[05:48:25] <Xzyx987X> there appears to be a rather gaping flaw in ruby's mongo library. say you need to have some code run while mongo is write locked. and say you have to run the code in a critical (exclusive) thread, because in ruby's thread model a global hang will occur if another thread tries to write to the database while it is locked. that would be fine, except say there is a bug in ruby's mongo library that causes queries to hang if excecuted i

[05:48:25] <Xzyx987X> n an exclusive thread

[05:48:32] <Xzyx987X> what should you do then?

[06:02:29] <Xzyx987X> also, if I excucute a lot of queries one after another, it makes my program exit without even throwing an error

[06:02:36] <Xzyx987X> so that's a thing...

[06:02:59] <Xzyx987X> anyone experienced in mongo/ruby feel like looking at this?

[06:17:52] <Xzyx987X> ok, now I've figured out how to get the program to hang when executing a read query after write locking the database, even without excecuting it in a critical thread

[06:18:03] <Xzyx987X> that's progress, I guess?

[06:53:32] <Neptu> hej someone knows how i can drop a collection from the driver?

[06:54:15] <Neptu> python driver, I mean i do not find any specific method so I might use a command??

[07:11:38] <Xzyx987X> ok, I guess to generalize my question here, what might cause mongodb to hang on a find query when write locked?

[07:13:07] <Xzyx987X> I've isolated all the variables, and have determined that all my issues go away if I don't attempt to write lock the database, but I still don't understand how a write lock would cause a query to hang that only reads

[07:42:49] <mboman> Hi guys. I have some issues when I want to update (upsert) a record. Python code looks like this: result = self.db.malware.update({'sha1': sha1sum}, malware, upsert=True) and the error I get is InvalidDocument: key '$oid' must not start with '$'. Suggestions?

[07:55:48] <liquid-silence> anyone around

[08:02:18] <liquid-silence> mboman did you get sorted?

[08:02:37] <mboman> liquid-silence, no, still having the issue

[08:03:07] <liquid-silence> can you paste the complete query

[08:04:46] <mboman> liquid-silence, sure. I'll stick it in pastebin

[08:06:20] <mboman> http://pastebin.com/FmArsJYV

[08:07:27] <liquid-silence> check your pm

[08:08:02] <liquid-silence> which line is the issue

[08:08:42] <liquid-silence> 163?

[08:09:57] <liquid-silence> are you trying to update a specific object?

[08:14:11] <liquid-silence> ok sorry dude I can't help

[08:14:21] <liquid-silence> can;t even figure out my own shit

[08:15:13] <joannac> mboman: you have a key starting with a '$', which is reserved in MongoDB

[08:15:45] <joannac> I'mguessing in the variable "malware"

[08:15:48] <mboman> joannac, I can't find it. I've been dumping my dict w/o finding it

[08:16:17] <joannac> dump and paste?

[08:16:24] <joannac> dump just before the update

[08:17:00] <liquid-silence> joannac I still fighting with this tree structure stuff

[08:17:02] <liquid-silence> :(

[08:19:09] <liquid-silence> joannac care to point me in the correct direction?

[08:20:21] <Xzyx987X> *sigh*, I don't suppose either of you know why a write lock would cause a read query to hang do you?

[08:20:33] <mboman> joannac, I modified my remove_dots() routine to also remove $

[08:21:51] <liquid-silence> ok guys, I need to build a hierarchy tree, where users have a "container" (similar to your home folder in linux), where you can create folders and sub folders and upload files

[08:22:13] <joannac> Xzyx987X: there's one lock. While the write lock is taken, no reads can progress.

[08:22:23] <liquid-silence> also you need to be able to give access to someone else to access a file/folder, read, write, browse

[08:22:40] <joannac> liquid-silence: I haveno idea where you're stuck.

[08:22:58] <liquid-silence> joannac to be honest the whole schema is the problem currently

[08:23:14] <joannac> Then maybe you should rethink your schema?

[08:23:14] <liquid-silence> and I am still learning queries with mongo etc...

[08:23:30] <liquid-silence> well I am not sure how to translare a sql schema to mongo

[08:23:39] <liquid-silence> s/translare/translate

[08:23:48] <joannac> Or, do it the brute force way, and see how bad it is, and that'll help guide you to how to make it better.

[08:24:00] <Xzyx987X> then why is it when I ran my code in the debugger or irb it would process read queries after write locking?

[08:24:10] <Xzyx987X> ok, but putting that aside...

[08:24:49] <Xzyx987X> if a write lock isn't the anser, how can I query two collection while ensure no writes update then while running the query?

[08:25:02] <joannac> implement your own locking.

[08:26:02] <Xzyx987X> so what you're saying is, mongo has no useful built in locking capability whatsoever? wonderful...

[08:26:23] <joannac> Xzyx987X: http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/

[08:26:42] <liquid-silence> how does one update to a specific item

[08:26:53] <liquid-silence> db.containers.update({ "folders._id": ObjectId("525baa0cf70dc4d6b7ceecc0")}, { $push: { "files": { _id: ObjectId......

[08:26:58] <liquid-silence> this adds it to the top level

[08:27:05] <liquid-silence> and not below the folders

[08:27:10] <Xzyx987X> k, I'll give that a read. looks promising

[08:27:19] <joannac> Xzyx987X: sorry, misread. you want to read 2 collections and not allow writes while you're doing so?

[08:28:10] <joannac> Xzyx987X: actually, no, that article's still useful for you.

[08:29:20] <joannac> liquid-silence: then push folders.XXX ?

[08:29:51] <liquid-silence> db.containers.update({ "folders._id": ObjectId("525baa0cf70dc4d6b7ceecc0")}, { $push: { folders.files: { _id:

[08:29:57] <liquid-silence> on Oct 14 10:28:58.126 SyntaxError: Unexpected token .

[08:31:51] <liquid-silence> db.containers.update({ "folders._id": ObjectId("525baa0cf70dc4d6b7ceecc0")}, { $push: { folders.files: { _id: ObjectId(), name: "test.mp4" }}})

[08:31:54] <liquid-silence> the full query

[08:33:04] <Xzyx987X> joannac, it sort of is, I'm not really sure how to apply the information to my code though. here are the two queries I need to make run without any write occuring between them:

[08:33:07] <joannac> What's folders? an array or a subdoc?

[08:33:08] <Xzyx987X> collection.find({}).to_a

[08:33:24] <Xzyx987X> entry_collection.find({"refs" => 0}, {:fields => ["_id"]})

[08:34:31] <joannac> Xzyx987X: similar principle. Have a collection with a "I'm reading flag". Set the flag (if not set) when you start your read. Any write should check if that flag is set; if so, wait.

[08:34:45] <liquid-silence> it does not exit yet joannac

[08:35:37] <liquid-silence> exists

[08:35:38] <joannac> oh

[08:35:47] <liquid-silence> the files node does not exist

[08:36:54] <Xzyx987X> ok, so I guess a gc flag in my entry collection would do the trick?

[08:37:42] <joannac> liquid-silence: oh duh.you need to quote a string with a . in it". Also you may need an upsert flag?

[08:37:44] <Xzyx987X> but then if I find that it's enabled, from what I understand the only way to know when garbage collection is complete is to keep polling the flag

[08:37:57] <Xzyx987X> that's workable, but not very efficient

[08:38:19] <liquid-silence> db.containers.update({ "folders._id": ObjectId("525baa0cf70dc4d6b7ceecc0")}, { $set: { "folders.files": { _id: ObjectId(), name: "test.mp4" }}}, true.

[08:38:33] <liquid-silence> can't append to array using string field name: files

[08:38:35] <liquid-silence> wtf

[08:38:41] <Xzyx987X> depending on the amount of pending write queries, the traffic generated by polling could be rather large

[08:39:02] <liquid-silence> gah this thing is going to drive me insane

[08:39:04] <liquid-silence> FFS

[08:41:08] <joannac> liquid-silence: {upsert:true}, also closing brace

[08:41:10] <liquid-silence> this is the correct way

[08:41:12] <liquid-silence> b.containers.update({ "folders._id": ObjectId("525baa0cf70dc4d6b7ceecc0")}, { $set: { "folders.$.files": { _id: ObjectId(), name: "test.mp4" }}}, true

[08:41:35] <joannac> Also $push

[08:42:19] <Xzyx987X> joannac, so is there any way the run a mongo query that will block until a certain condition is satisfied to avoid the polling issue?

[08:43:38] <joannac> Xzyx987X: Erm, no. What if the query gets stuck? You want to take down the whole server?

[08:44:43] <Xzyx987X> well, prefferably it would only block the thread in which it's executed...

[08:45:01] <joannac> Everything will yield: http://docs.mongodb.org/manual/faq/concurrency/#does-a-read-or-write-operation-ever-yield-the-lock

[08:46:45] <joannac> I'm out; you guys can help each other :)

[08:47:47] <Xzyx987X> I'm probably not going to be much help... I've been working with mongo for all of three days now...

[08:59:08] <liquid-silence> joannac db.containers.update({ "folders.folders._id": ObjectId("525bb112f70dc4d6b7ceecd3")}, { $push: { "folders.$.folders": { _id: ObjectId(), name: "Sub folder" }}})

[08:59:17] <liquid-silence> why is that not nesting it under the current folder

[09:01:50] <Xzyx987X> haha, wow, I just realized that if I changed the order of the queries, it actually wouldn't matter in this particular case if the data was updated between then >.<

[09:02:04] <Xzyx987X> and now I'm off the bang my head against the wall for an hour...

[09:05:11] <liquid-silence> gah starting to hate this

[09:05:17] <liquid-silence> feel completely dumb :P

[09:11:23] <Xzyx987X> you feel dumb? I just spent the past four hours trying to solve a problem that could have been solved by switching to positions of two lines of code

[09:13:48] <jackblackCH> hi, anyone know know how to define the encoding when using mongoexport --csv ? i have special chars which looks bad in my csv

[09:14:11] <Kim^J> Don't use CSV? :S

[09:14:34] <jackblackCH> I'm swiss and ö, ä, and ü are replaced with kind of _Ad3f

[09:14:47] <jackblackCH> hmm you know an alternative?

[09:14:57] <Kim^J> What are you trying to do?

[09:15:43] <jackblackCH> i have a collection of users with email addressees , they registered on the web. now my project manager needs a list

[09:15:47] <jackblackCH> a typical export

[09:15:48] <liquid-silence> db.containers.update({ "folders.folders._id": ObjectId("525bb56df70dc4d6b7ceecf7")}, { $push: { "folders.$.folders.folders": { _id: ObjectId(), name: "Third Level" }}})

[09:16:00] <liquid-silence> why is that not going in the folder that I have quried

[09:16:06] <Kim^J> jackblackCH: Uhm, don't use mongoexport for this, use a custom program/script.

[09:16:20] <Kim^J> jackblackCH: Should be able to whack up something in Python.

[09:16:44] <jackblackCH> ok thanks for the info - then i write a script :(

[09:17:05] <jackblackCH> hoped i can define anywhere

[09:17:10] <liquid-silence> this is so annoying

[09:17:10] <Kim^J> jackblackCH: Then you're free to handle encoding how you like. And format and suck.

[09:17:11] <liquid-silence> FF

[09:17:13] <Kim^J> such*

[09:17:43] <jackblackCH> :) i see - i hate swiss special chars from now on

[09:17:49] <jackblackCH> not made for we

[09:17:50] <jackblackCH> b

[09:18:01] <jackblackCH> english for all please, thanks guys

[09:18:06] <jackblackCH> :) i gonna write it

[09:18:58] <Kim^J> liquid-silence: Can you reorganize and use a dictionary instead?

[11:05:11] <Neptu> can mongodb python driver drop a collection or I need to use command instead?

[11:34:57] <Nomikos> leifw: if you remember that search/sort found-count discrepancy I had last week, turned out the index on that field had only indexed about half the documents >.>

[11:35:31] <Nomikos> removing index, all 700 found (sort() wasn't adding docs, the search was just not finding them all)

[11:44:59] <bin> Hello guys! I have a question and it is: Can i use $elemMatch operator when i want to update particular element in an array and if yes , could you give me an example please. Cheers in advance. ( Couldn't find example in google)

[11:58:48] <joannac> bin: http://pastebin.com/udssG9Q6

[11:59:06] <bin> cheers mate

[12:01:01] <Nomikos> joannac: is there some way to use the new Text Search for the $match part of the aggregation pipeline? the regular index fails because the field contents are too long sometimes..

[12:01:29] <Nomikos> and therefor only returns about half the documents.

[12:01:43] <Nomikos> if I drop the index $match returns the correct set

[12:04:47] <joannac> Nomikos: afaik no

[12:05:09] <Nomikos> bummer

[12:05:15] <joannac> why do you need both?

[12:08:04] <Nomikos> I'm wanting to use the aggregation pipeline to show found-docs-per-category, but I guess I'll have to run that one without an index then.

[12:08:44] <Nomikos> the actual search results showing up in the main part of the page could then still use the new Text Search

[12:10:36] <bin> joannac: sorry to bother ya i will give you an example .. tried to align them properly. -> http://pastebin.com/95b711kR

[12:10:57] <bin> hope you understand it

[12:28:35] <bin> joannac: got it ... you are the best ;)

[13:03:52] <_Heisenberg_> Hi folks. I have a problem with the readPreferred settings. Even if I use a command like db.products.find().readPref (’primaryPreferred’).count() the operation blocks while the cluster is doing a failover and answers after a new primary is elected. I was expecting that the secondaries answer the query even if the cluster is performing a failover?

[13:04:31] <_Heisenberg_> Same for my node.js application where I set the readPreference in the application of course...

[13:06:09] <_Heisenberg_> My Cluster contains two shards, which consist of 5 nodes each

[13:06:52] <_Heisenberg_> MongoDB shell version: 2.4.5

[13:22:52] <kali> _Heisenberg_: during failover, everything is more or less on hold until the election whatever the read preference is

[13:27:45] <_Heisenberg_> kali: the documentations says something different: "In most situations, operations read from the primary member of the set. However, if the primary is unavailable, as is the case during failover situations, operations read from secondary members." ( http://docs.mongodb.org/manual/reference/read-preference/#primaryPreferred)

[13:28:29] <BlackPanx> does mongodb have to bind on localhost ip ?

[13:28:43] <BlackPanx> or can be only internal ip like 192.168.xx.xx

[13:28:49] <BlackPanx> without localhost ip

[13:29:38] <Neptu> hej quiestion about the python driver... I do nto find a method the drop the collection, should I use command or maybe Im mistakend and there is a method on the mongoClient?

[13:29:49] <_Heisenberg_> kali:

[13:29:57] <_Heisenberg_> ooops sorry

[13:30:02] <Neptu> second question is proper to use mongoclient against a replica or or I need the mongoclientReplica ?

[13:31:06] <kali> _Heisenberg_: mmm ok. this does not match my impressions and experience

[13:31:46] <_Heisenberg_> kali: Thats my problem! As I'm using mongo in my thesis I have to make sure that this behaviour is not my fault ^^

[13:33:14] <kali> _Heisenberg_: you may want to try it with another client as the node.js one may not be the most "compliant" of them. try the java one for instance

[13:34:15] <_Heisenberg_> kali: I think that is not needed since I watch that behaviour even in the mongos shell :/

[13:34:34] <_Heisenberg_> I'm just curious if I missed a setting or something

[13:36:02] <_Heisenberg_> I'm trying to set diaglog=3 in my mongos config file for a jira issue but if I put that parameter the restart of mongos fails. suggestions?

[13:40:58] <jyee> and nothing in your error log about the failed start?

[13:41:24] <bin> anyone working with mongodb and java ?

[13:46:21] <Danielss89> Hi

[13:46:45] <Danielss89> when i use find() on my collection i get some documents but i have to type "it" to show more.. can i get it to show all instead?

[13:50:10] <Nodex> no

[13:50:17] <Nodex> well yes with a forEach(

[13:50:40] <Nodex> find() returns a cursor which must be itterated

[13:50:45] <kali> mmm i think there is a way to change the batchSize somehow

[13:51:32] <kali> ha, it's not the batchSize

[13:52:03] <Nodex> lol

[13:52:46] <Danielss89> DBQuery.shellBatchSize = 300

[13:52:52] <Danielss89> would that show me 300 at a time?

[13:53:24] <kali> Danielss89: you can try :)

[13:53:30] <Danielss89> yeah

[13:53:33] <Danielss89> side question

[13:53:50] <Danielss89> nothing to do with mongo thoug, but you might know anyway

[13:54:03] <Danielss89> using terminal

[13:54:06] <Danielss89> how can i clear history

[13:54:18] <Danielss89> clear and reset removes it, but when i search it still shows old

[13:54:28] <Danielss89> i want it to be like if i open a new tab

[13:54:29] <Nodex> "clear"

[13:54:35] <Nodex> minus the quotes ;)

[13:54:42] <Nodex> oh bash history ?

[13:54:48] <Danielss89> well

[13:54:49] <Nodex> history -c

[13:54:50] <Danielss89> hmm

[13:54:57] <Nodex> and \n is not punctuation :)

[13:55:22] <Danielss89> i changed my "." key to be \n

[13:55:23] <Danielss89> :D

[13:55:30] <Nodex> also you might want to echo > ~/.bash_history

[13:55:51] <Danielss89> ok thx

[14:32:40] <_Heisenberg_> kali: fyi, the problem with the readpreference was already reported: https://jira.mongodb.org/browse/SERVER-7246

[14:52:00] <tripflex> anybody around?

[14:52:07] <tripflex> I have a question about aggregation and using day operators

[14:52:10] <tripflex> like $dayOfMonth

[14:52:53] <tripflex> i'm passing a unix epoch timestamp and having issues with bson vs string data

[15:08:26] <Derick> moin

[15:09:44] <PiyushK> hey .. any considerations / pointers to consider while choosing between mongodb and couchbase .?

[15:21:47] <Nodex> PiyushK : right tool for the job :)

[15:25:17] <PiyushK> Nodex, for personalization service and recommendation engine to store and analyze web usage data ...

[15:25:55] <Nomikos> I just found http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis and .. whoa.

[15:26:13] <Nomikos> ofcourse there are other pages like that

[15:46:16] <theCzar> Mongodb n00b here. I'm working on setting up some scripts to configure data migration for MongoDB from one host to another. With databases like Redis, in the past I have just told one running instance to become the slave of the other, waited for them to get in sync, and then shut off the master. What is the best way to do this in MongoDB? I have only seen a way to configure master-slave replication at startup.

[16:02:57] <jiffe99> there something wrong with this statement? db.orders.remove({'coin':'INF','$lt':{'timestamp':1381766396}})

[16:03:13] <jiffe99> after running it db.orders.find({'coin':'INF'}).pretty() shows entries with timestamps less than that value still in there

[16:04:47] <jiffe99> nm I think I have that backwards

[16:35:57] <clarkk> I have a collection of documents with a "category" field - thus multiple documents have the same category. I need to query the docs so that I get arrays of documents with the same category. Could someone give me a hint as to how I would go about this?

[16:36:13] <clarkk> it's very difficult to know what to google for

[16:41:11] <tripflex> clarkk: give me an example of your schema

[16:42:37] <Nomikos> can aggregation be used in any way with an indexer in front of MongoDB, or do those indexers have something similar to aggregation?

[16:43:17] <Nomikos> clarkk: can you sort by category and construct the arrays in code?

[16:43:47] <Nomikos> something like foreach (results as doc) data[doc.category].push(doc)

[16:44:09] <Nomikos> ..that might work without sorting..

[16:45:39] <clarkk> ok Nomikos - I will try that. Thanks

[16:45:42] <clarkk> thanks tripflex

[16:47:20] <clarkk> hmm, on second thoughts, it doesn't sound right

[16:47:31] <tripflex> clarkk: send me an example and i'll tell you what to do

[16:47:33] <clarkk> to do a foreach, it needs to be in an array

[16:47:47] <Nomikos> clarkk: I was kinda guessing as to what you wanted, sorry

[16:47:58] <tripflex> you can use aggregation to do this

[16:48:00] <Nomikos> if you elaborate a little that will help

[16:48:05] <tripflex> but without knowing your schema i can't tell you how to set it up

[16:49:05] <tripflex> you may not even need to use aggregation but it all depends on how your schema is setup

[16:56:47] <clarkk> tripflex: Nomikos my schema is very simple (although the dataset is much larger than this, obviously) http://pastebin.ca/raw/2466634

[16:57:07] <tripflex> oh that's simple

[16:57:15] <clarkk> so preferably I'd like arrays of categories

[16:57:28] <clarkk> so here there would be two arrays

[16:57:43] <tripflex> mymodel.find({ category: 'News' }, function (error, results) ...

[16:57:51] <tripflex> what language are you using

[16:57:54] <clarkk> tripflex: javascript

[16:58:02] <clarkk> tripflex: I want all categories

[16:58:27] <clarkk> and then I am going to print the titles out under their respective category

[16:58:40] <tripflex> so you want it to return an object that has an array of found values?

[16:59:17] <clarkk> eventually, yes, but I'd first like to know how to do this simple task

[16:59:46] <tripflex> { category: 'News', data: [ { title: 'Google News' }, { title: 'independent' } ] }

[16:59:51] <tripflex> like that

[16:59:53] <tripflex> ?

[17:00:30] <tripflex> explain to me what you're trying to accomplish

[17:00:31] <Nomikos> http://pastebin.ca/raw/2466636

[17:00:32] <tripflex> like end result

[17:00:36] <clarkk> tripflex: yes, as long as those items in the aray are whole documents

[17:00:41] <Nomikos> like that? probably late :p

[17:01:00] <clarkk> Nomikos: that looks ideal

[17:01:04] <tripflex> it will be

[17:01:17] <tripflex> the results will be the document it found with category News or whatever you use find for

[17:02:22] <clarkk> Nomikos: so what query operators should I be looking at?

[17:03:02] <clarkk> groupby seems to reduce the items down to one value (eg to count all items in a category)

[17:03:54] <tripflex> http://smyl.es/how-to-count-the-number-of-values-found-for-a-field-in-mongodb-using-node-js-and-mongoose-using-aggregation/

[17:04:02] <tripflex> depending on how deep you want to get into it

[17:04:06] <tripflex> aggregation may be what you want

[17:04:36] <clarkk> tripflex: I don't want to count the fields - I was giving an example of what I don't want

[17:04:56] <tripflex> yeah but gives you an example of aggregation

[17:05:02] <tripflex> at the bottom, the top video

[17:05:07] <tripflex> is a great explanation of how aggregation works

[17:05:14] <clarkk> oh ok :)

[17:05:26] <clarkk> ok I will look through that

[17:05:27] <tripflex> but like i said you may not need it

[17:05:28] <clarkk> thank you tripflex

[17:05:34] <tripflex> but i dont know what your end result is that you want

[17:05:53] <clarkk> tripflex: I want pretty much what Nomikos said

[17:06:14] <tripflex> ah

[17:06:24] <tripflex> so each category is the key

[17:06:30] <tripflex> and value is array of categories found

[17:06:36] <Nomikos> clarkk: this might work for you? http://pastebin.ca/2466638

[17:07:00] <Nomikos> the logic is in the code, not in mongodb, so it may not be what you were looking for..

[17:07:26] <Nomikos> the mongodb part would simply be a db.coll.find().. I'm pretty new to the db side of things >.>

[17:08:45] <Nomikos> if you sorted on category, then title, it might be simpler still to use the result set. simply set a var for this_category and compare that to category on each .. whatever it is you're doing with the loop

[17:09:01] <tripflex> nah just use aggregation and addtoset

[17:09:05] <tripflex> should be straight forward

[17:09:37] <clarkk> Nomikos: it's not clear how that code iterates through the categories

[17:09:56] <Nomikos> clarkk: probably best to look at tripflex's sources first :-)

[17:10:10] <tripflex> just a sec ill show you how to use aggregate

[17:10:20] <Nomikos> I set data as an example result set of find()

[17:10:34] <clarkk> if (! out[doc.category]) checks whether teh category field exists. However, they will always exist in those records

[17:10:44] <clarkk> Nomikos: understood

[17:10:45] <Nomikos> but I've only learned about aggregation end of last week, so..

[17:10:59] <clarkk> tripflex: it's ok - let me digest those resources you sent

[17:11:05] <Nomikos> clarkk: it checks if the index is set in the 'out' result set

[17:11:09] <clarkk> tripflex: solving the problem myself will help more

[17:11:12] <Nomikos> since there, it becomes an array

[17:11:53] <Nomikos> but what I've seen from the aggregation pipeline, it's pretty damn powerful

[17:12:20] <clarkk> hmm, I don't understand how that checks if the index is set

[17:12:51] <clarkk> in fact there is no category field in the out set

[17:12:54] <Nomikos> it just checks if it's truthy. if not, it initializes it to an empty array

[17:13:04] <Nomikos> that's why it checks first :-)

[17:13:26] <Nomikos> this is reasonably straight-forward JS

[17:14:02] <clarkk> yes, it is, but why do you have two result queries? out and data?

[17:14:07] <tripflex> http://glot.io/javascript/df45086d7b90f39f34d16378180eaa50

[17:14:11] <tripflex> that should give you an idea of how to do it

[17:14:27] <clarkk> tripflex: I;ll look at it after I've read the article. Thank you :)

[17:14:41] <tripflex> actually let me make sure code is right i did that quickly

[17:15:18] <tripflex> watch the second video on that link i sent you earlier

[17:15:23] <tripflex> that's specific to aggregate and grouping

[17:15:41] <clarkk> ok, great - thanks!

[17:16:33] <tripflex> yeah that's all you need

[17:16:37] <tripflex> should be correct

[17:16:47] <tripflex> you don't really need the "category" field but i added it just in case

[17:16:57] <tripflex> because the _id is set as category

[17:17:24] <clarkk> tripflex: not in my real dataset tho - I condensed it for brevity

[17:17:38] <tripflex> i know, but that's where you should start

[17:17:43] <Nomikos> tripflex: would it do a secondary group on 'category' in your example?

[17:17:44] <tripflex> so learn about aggregation and grouping

[17:18:27] <tripflex> it just sets category to the same thing as _id

[17:18:33] <tripflex> the id is what you're grouping them by

[17:19:25] <tripflex> so like, go through my model, group everything by the _id that equals $category, then add category field with $category, and title field. In title field add to the array in the title of current document

[17:20:27] <tripflex> but just like using find, you can group by multiple fields

[17:20:34] <tripflex> so instead of _id: "$category"

[17:20:37] <tripflex> you could do

[17:21:13] <tripflex> this: _id: { category: "$category", parent: "$parentcategory" }

[17:21:24] <clarkk> tripflex: ok, but I don't want to look at the code until I've gone through the article

[17:21:29] <tripflex> and it would group by documents that have the same cateogyr and parentcategory

[17:21:50] <tripflex> and returns a new document with those values

[17:22:03] <tripflex> oh trust me, i'm still learning aggregation but it is very powerful

[17:22:13] <tripflex> soon you will learn project, and match, the list goes on

[17:22:23] <tripflex> https://education.mongodb.com/

[17:22:27] <tripflex> sign up for a free mongodb class

[17:22:36] <tripflex> node.js one starts in the next few days

[17:22:51] <tripflex> unfortunate thing is there's not much information on the internet about aggregation framework

[17:24:15] <tripflex> but if you watch those two videos from my blog, it will give you a good understanding of it

[17:24:28] <tripflex> http://smyl.es/how-to-count-the-number-of-values-found-for-a-field-in-mongodb-using-node-js-and-mongoose-using-aggregation/

[17:25:25] <Nomikos> I love how OS X quicklook has a webpage preview for bookmark files

[17:25:57] <tripflex> good luck :)

[17:27:18] <Nomikos> "Prerequisites: Node.js" hmmmm.

[17:27:45] <Nomikos> ah well, there's the weekend.

[17:29:14] <tripflex> there's a couple of them ;-)

[17:29:41] <Nomikos> yeah, but there's only one before the course starts :)

[17:29:51] <Nomikos> and having never actually worked with node.js..

[17:30:55] <clarkk> gracias, tripflex :)

[17:31:57] <tripflex> np

[17:36:51] <liquid-silence> hi all

[17:36:52] <liquid-silence> http://pastebin.com/NyBswNjQ

[17:37:08] <liquid-silence> is there a way to pass in 525bdda1000000b975000089 and find the document with that name?\

[17:37:55] <liquid-silence> I dont know if this is possible

[17:38:03] <tripflex> you should restructre that

[17:38:13] <liquid-silence> but I dont want to do the dot annotation because from the client side it will not be great

[17:38:18] <liquid-silence> tripflex what do you suggest

[17:39:20] <tripflex> without doing the entire thing

[17:39:22] <tripflex> http://pastebin.com/vHdPSRJh

[17:39:23] <tripflex> then just

[17:39:48] <tripflex> liquid-silence: mymodel.findOne({ key: '525bdda1000000b975000089' } ......

[17:40:41] <tripflex> your using something that i assume will be dynamic as the key

[17:40:53] <tripflex> which will make it difficult if not impossible to run queries on

[17:41:04] <tripflex> if it's dynamic it should be set as a value of a key so you can query the key

[17:44:54] <rafaelhbarros> joannac: hello, friday I had to leave asap. two nodes of aws were terminated, thats the reason why the mongodb wasn't picking up a master.

[17:45:11] <rafaelhbarros> joannac: I'm not sure you remember the subject we were discussing.

[17:45:21] <sulo> Hi folks

[17:48:24] <liquid-silence> tripflex I am going back to my data schema design

[17:48:30] <liquid-silence> I think its really wrong

[17:50:00] <tripflex> :P

[17:50:05] <sulo> i'm kinda new to mongodb and have actually only read the o'reilly book about it... but i'm kind of wondering whether mongodb is also a good fit for small applications with only one db server.. in the book is written that you should always use at least 3 db server because of the way the primary server is selected

[17:50:41] <sulo> so i'm kind of wondering whether it is a good idea to start a example application with just one mongo server...

[17:52:44] <kali> sulo: well, it's more or less the same for any kind of DB. you need backups to recover from a total db server crash with any kind of DB, and at least one spare server if you can't afford to be down

[17:52:58] <Nomikos> sulo: we've been deving with a single mongodb server for nearly a year..

[17:53:40] <Nomikos> well, there's the live and the dev server, and both have backups, but it's not distributed or anything yet. not needed yet.

[17:54:21] <sulo> Nomikos: so ho wdo you currently back up your data? is there something like mysqldump?

[17:54:27] <Nomikos> I'd say it's fine for demoing, proof of concept, low loads, learning, ..

[17:54:44] <kali> sulo: http://docs.mongodb.org/manual/core/backups/

[17:54:54] <Nomikos> sulo: tbh I don't know, our db/admin guy takes care of it

[17:54:58] <sulo> kali: thx

[17:55:15] <sulo> i have one more question actually :)

[17:57:36] <sulo> in the book they say that if a record grows it gets moved to an other place which can be slow... as i currently see it (and i'm a big noob on that topic ;) ) the whole point of the records is the ability to make them grow... like saving comments to an post in one document... (else i could also use a join in a relational db) .. is there any good practice to prevent such things or make them performant?

[17:58:32] <kali> sulo: on a post/comment model, you would need a HUGE number of comments per seconds for this to become a problem

[17:59:13] <sulo> kali: well in the book they don't give any numbers or something.. they just say that it is slow

[17:59:25] <sulo> and that you should not do that

[17:59:26] <sulo> ^^

[17:59:59] <kali> http://docs.mongodb.org/manual/reference/command/collMod/#usePowerOf2Sizes and that can help

[18:00:15] <kali> yeah, it's all relative

[18:00:45] <sulo> kali: is there something that tells you what is good schema design in mongo db? maybe a good blog post/article

[18:01:44] <kali> start by having a look at this section... http://docs.mongodb.org/manual/data-modeling/

[18:01:54] <kali> then... well.

[18:02:32] <sulo> kali: thanks for your help :) .. i think i'll just to have to try it out in an actual application

[18:02:55] <kali> identify the most frequent and critical data access patterns, focus on making them fast and efficient

[18:03:06] <sulo> maybe it's a better idea to think about performance because of document movements when performance issues occur

[18:03:25] <kali> a friend of me says "one http request, one mongo query". i think this is extreme, but...

[18:04:05] <kali> sulo: well, usually you need to worry more about reads than writes.

[18:04:27] <kali> sulo: so moving a few documents around, with a typical web load (99% read, 1% write) is a non issue

[18:04:52] <sulo> kali: ok thanks

[18:05:19] <kali> sulo: if you start logging access on each post inside the document, then growing documents is going to be a problem earlier

[18:05:34] <kali> but for comments... no.

[18:06:31] <sulo> ok.. in the book it reads kind of radical .. so i was a bit unsure how big of a problem that realy is

[18:07:27] <kali> it is if you're the nsa, i guess

[18:08:07] <Nomikos> damn those chatty terrorists

[18:08:19] <sulo> ;-)

[18:09:46] <sulo> well thanks for answering the questions :) i'll go now and try it out :)

[18:09:56] <sulo> so have a nice day

[19:04:49] <clarkk> tripflex: I'm no closer to solving this :(

[19:34:03] <clarkk> in this mapReduce example, why does the map function order by cust_id?

[19:34:05] <clarkk> http://docs.mongodb.org/manual/core/map-reduce/

[19:39:04] <joannac> clarkk: It doesn't, it just happens to be ordered.

[19:40:48] <clarkk> joannac: in the diagram there are 4 tables. The third is the result of the map function. Why are the keys (holding the array of amounts) the cust_id key?

[19:41:38] <joannac> because that's what the map function does?

[19:41:59] <joannac> emit(this.cust_id, this.amount)

[19:42:50] <clarkk> ok, thanks joannac

[19:45:04] <clarkk> joannac: I don't suppose you have any ideas how to achieve this result, from a collection like that stored in the d object here... http://pastebin.ca/raw/2466722

[19:46:22] <clarkk> each array needs to contain the whole documents (not just the title field)

[19:47:25] <joannac> Why is your d not an array?

[19:48:13] <joannac> If it was an array you could do it via the aggregation framework I think.

[19:48:23] <clarkk> joannac: d is a collection, as returned by mongoose. Please disregard the fact that it's an object in that code

[19:49:08] <joannac> wait, you want your output to be:

[19:49:31] <joannac> News: [{title: "Google News", category: "News"}, ....]

[19:49:46] <joannac> you want to duplicate the category?

[19:50:08] <clarkk> not necessarily

[19:51:16] <clarkk> it's frustrating - the mongodb people mention pivoting collections, but it's very difficult to find any resources explaining how to do it

[19:51:46] <joannac> So you want to duplicate the output in your pastebin?

[19:51:55] <clarkk> joannac: yes

[19:52:52] <clarkk> but also need to have access to other fields in the collection - not category, but others

[19:53:08] <clarkk> s/collection/document

[19:54:58] <joannac> So just emit all of the others in your value?

[19:55:15] <joannac> your map function has to emit a key-value pair

[19:55:22] <joannac> the key and value can be whatever you want

[19:56:11] <joannac> make your value another document

[19:56:34] <clarkk> joannac: can you have more than one emit in a map function?

[19:58:04] <kali> clarkk: yes. 0 to N

[20:01:00] <clarkk> where does the result of emit end up? In an array?

[20:01:59] <kali> wow, you sound confused

[20:03:08] <clarkk> kali: why don't you unconfuse me? :p

[20:03:51] <kali> i don't know where to start... have you read and understood that ? http://docs.mongodb.org/manual/core/map-reduce/

[20:03:54] <clarkk> what does map using emit pass on to the next step in the pipeline

[20:04:17] <clarkk> yes, as I said just above, that is what I am referring to

[20:05:13] <kali> reduce will receive values for the same key together

[20:05:52] <clarkk> ok, that makes more sense

[20:08:46] <joannac> hi kali :)

[20:09:29] <kali> joannac: hi

[20:12:23] <joannac> how're things in your neck of the woods?

[20:23:52] <liquid-silence> db.assets.find({ parent: null }, $or: [{ "permissions.userid": 2 }, { "owner": 2 }], { name:1 })

[20:24:12] <liquid-silence> how can I only add the "permissions.userid" :2 if the permissions array exists in the document

[20:42:57] <leonardfactory> hi :D

[20:46:34] <leonardfactory> I have a question.. I'm using mongo2.4.6 and when I try to sort a query with $geoWithin, it's very slow.. those are the output from .explain(): http://bit.ly/16bWtQ8

[20:48:15] <leonardfactory> It seems it's not using the index at all, even in the $geoWithin query due to the "S2Cursor". However i tried both with a { radius: -1, geometry : '2dsphere' } compound index and a { geometry : '2dsphere', radius : -1 }

[20:48:16] <leonardfactory> any suggestion? Thank you :)

[21:13:51] <leonardfactory> no one :)?

[21:22:10] <cheeser> what did the explain say?

[21:49:27] <leonardfactory> cheeser: Here you go http://bit.ly/16bWtQ8 :)

[21:49:53] <cheeser> it's not using your index at all it seems.

[21:50:25] <cheeser> what do your query and your indexes look like?

[21:50:53] <leonardfactory> I'm going to upload the query, just one second

[21:51:39] <leonardfactory> https://gist.github.com/leonardfactory/75c677413b02ee67e399 this is the query

[21:52:09] <leonardfactory> Regarding the indexes, I'm using mongoose, building them with: schema.index({ radius : -1, geometry : '2dsphere' });

[21:52:23] <leonardfactory> or schema.index({ geometry : '2dsphere', radius : -1 });

[21:52:45] <leonardfactory> but nothing happens! Probably I'm doing something really bad here, but I can't see it

[21:53:27] <cheeser> in the shell run, db.<your collection>.getIndexes()

[21:54:17] <cheeser> (disclaimer: the geo query stuff isn't my strongest, but i'll try to help)

[21:54:54] <leonardfactory> yep, I checked it and I have one with "name" : "radius_-1_geometry_2dsphere" currently, checked before and even "geometry_2dsphere_radius_-1" didn't work

[21:55:09] <leonardfactory> (thank you anyway :)

[21:55:17] <joannac> leonardfactory: I'd love to see a sample document?

[21:55:23] <leonardfactory> sure joannac

[21:55:47] <cheeser> and the content of that index.

[21:56:40] <leonardfactory> https://gist.github.com/leonardfactory/a7fe76c3df063f8bebc7 this is the sample document

[21:57:31] <leonardfactory> cheeser: how can I check the index content? Sorry, I'm pretty new to mongo

[21:58:33] <joannac> db.<your collection>.getIndexes()

[21:58:54] <leonardfactory> ah, yes, going to paste it :)

[21:59:08] <joannac> Unless cheeser wanted something else I'm not thinking of...

[21:59:44] <leonardfactory> https://gist.github.com/leonardfactory/af9ed8eaae36f0575b3f <- this is the real getIndexes output

[22:02:03] <cheeser> joannac: nope. that's exactly it. :)

[22:02:03] <leonardfactory> the index size, from db.areas.stats(), seems reasonable: "radius_-1_geometry_2dsphere" : 41068048

[22:02:29] <joannac> if you .hint() the index, what happens?

[22:02:55] <cheeser> i'm wondering if the order of those fields matter to a 2d index.

[22:03:14] <leonardfactory> I tried, but the cursor was every single time a `S2Cursor`, no performance boost, nothing

[22:04:23] <leonardfactory> cheeser: I didn't understand if for sorting with a 2dsphere index the sorting field must be placed in the compound index BEFORE the 2dsphere field, however I tried swapping them and it didn't use the index anyway

[22:07:14] <joannac> I tried this yesterday with a 2d index, with the 2d field first, the the non-2d field later

[22:07:23] <leonardfactory> 2d or 2dsphere?

[22:07:30] <joannac> 2d, not 2dsphere

[22:07:57] <leonardfactory> oh, and does it work? Maybe the problem resides in the $geoWithin

[22:08:05] <leonardfactory> but it feels so strange

[22:08:20] <joannac> Yeah. Let me try and spin something up

[22:10:14] <leonardfactory> Good, thank you!

[22:12:51] <leonardfactory> Mhhh joannac cheeser: https://jira.mongodb.org/browse/SERVER-10801 I think this is closing the question, right?

[22:12:55] <leonardfactory> What a shame..

[22:12:56] <joannac> Works for me

[22:13:00] <leonardfactory> oh lol

[22:13:19] <leonardfactory> can I see your explain result?

[22:13:26] <leonardfactory> just to understand what should I look for

[22:13:28] <joannac> Oh, sorry, I have the 2dsphere first

[22:15:10] <joannac> Wait, no I don't

[22:15:20] <leonardfactory> I'm a noob so, can you tell if it is ok to do what I'm trying to achieve (search 2dsphere -> sort results) even having the 2dsphere index first?

[22:16:12] <joannac> From the sounds of it, no.

[22:16:22] <leonardfactory> Mhh.

[22:16:49] <joannac> I can't get mine to use the index at all, evern without a sort :(

[22:16:59] <leonardfactory> yep, my same situation

[22:17:12] <leonardfactory> 2dsphere is using S2Cursor everytime

[22:17:19] <leonardfactory> even without sort

[22:28:49] <leonardfactory> well

[22:29:08] <leonardfactory> I think I'm going to sleep, maybe tomorrow a solution will appear!

[22:29:14] <leonardfactory> Good night guys :)

[22:29:26] <leonardfactory> (At least, from Italy timezone)

[22:29:30] <leonardfactory> bye!

[22:38:15] <Gaddel> if i want something like coll.distinct(), but returning values from multiple keys instead of one key, what is the fastest way to do this? multiple "distinct" queries, or a regular find() with projection operator, or aggregation framework?

[22:51:07] <Gaddel> anyone?

[23:10:31] <jayvee> who here is responsible for http://www.mongodb.com/leading-nosql-database ?

[23:10:39] <jayvee> images on the right hand side are 404'ing

[23:11:21] <cheeser> i'll email some folks who either are or know who is. thanks for the heads up.

[23:21:33] <joannac> cheeser: I pm'd you a question, if that's okay.

[23:28:28] <cheeser> apparently my whitelist is active again so i never got it.

Log file Viewer

Help | Karma | Search:

#mongodb logs for Monday the 14th of October, 2013