[01:08:49] <Derick> BurtyB: it's also a good introduction to python really
[01:11:57] <BurtyB> Derick, I did sign up when they first announced them but had no clue about python and making it work with apache so gave up when they expected it to be working
[01:12:24] <metasansana> BurtyB, mostly shell so far yes but I believe there are parts related to deployment like clusters and sharding etc.
[01:13:44] <Derick> BurtyB: I didn't think apache had anything to do with it?
[01:15:41] <metasansana> I did the one for nodejs developers as well it was mostly easy.
[01:16:16] <metasansana> In my opinion it barely touched on nodejs specific stuff.
[01:24:13] <BurtyB> Derick, iirc it was actually bottle
[04:07:24] <bin> okay guys here is the question. I have an array of objects. How to specify dynamically (based on the value of some property) to which element of the array to update its property
[05:48:25] <Xzyx987X> there appears to be a rather gaping flaw in ruby's mongo library. say you need to have some code run while mongo is write locked. and say you have to run the code in a critical (exclusive) thread, because in ruby's thread model a global hang will occur if another thread tries to write to the database while it is locked. that would be fine, except say there is a bug in ruby's mongo library that causes queries to hang if executed i
[06:02:59] <Xzyx987X> anyone experienced in mongo/ruby feel like looking at this?
[06:17:52] <Xzyx987X> ok, now I've figured out how to get the program to hang when executing a read query after write locking the database, even without executing it in a critical thread
[06:53:32] <Neptu> hey, does someone know how I can drop a collection from the driver?
[06:54:15] <Neptu> python driver, I mean i do not find any specific method so I might use a command??
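No command is needed: pymongo's `Database` has a `drop_collection()` method, and each `Collection` object has `.drop()`. A minimal sketch, using a hypothetical `FakeDatabase` stand-in so it runs without a live server:

```python
# pymongo exposes Database.drop_collection(name) and Collection.drop().
# FakeDatabase is a stand-in mimicking that interface so the example
# runs without a server; with a real pymongo Database the call is identical.

class FakeDatabase:
    """Minimal stand-in for pymongo.database.Database."""
    def __init__(self):
        self.dropped = []

    def drop_collection(self, name):
        self.dropped.append(name)


def drop_if_present(db, name):
    # With a real pymongo Database this is all you need:
    #   db.drop_collection("events")   or equivalently   db["events"].drop()
    db.drop_collection(name)
    return name


db = FakeDatabase()
drop_if_present(db, "events")
print(db.dropped)  # -> ['events']
```

Dropping a collection that does not exist is a no-op in MongoDB, so no existence check is required first.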
[07:11:38] <Xzyx987X> ok, I guess to generalize my question here, what might cause mongodb to hang on a find query when write locked?
[07:13:07] <Xzyx987X> I've isolated all the variables, and have determined that all my issues go away if I don't attempt to write lock the database, but I still don't understand how a write lock would cause a query to hang that only reads
[07:42:49] <mboman> Hi guys. I have some issues when I want to update (upsert) a record. Python code looks like this: result = self.db.malware.update({'sha1': sha1sum}, malware, upsert=True) and the error I get is InvalidDocument: key '$oid' must not start with '$'. Suggestions?
[08:19:09] <liquid-silence> joannac care to point me in the correct direction?
[08:20:21] <Xzyx987X> *sigh*, I don't suppose either of you know why a write lock would cause a read query to hang do you?
[08:20:33] <mboman> joannac, I modified my remove_dots() routine to also remove $
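MongoDB rejects field names containing `.` or starting with `$`, which is why the `$oid` key triggered `InvalidDocument`. A hedged sketch of such a `remove_dots()`-style sanitizer (the replacement characters are an arbitrary choice, and mboman's actual routine may differ):

```python
# Recursively rewrite keys that MongoDB rejects: '.' is replaced with '_'
# and leading '$' characters are stripped. Values and lists are walked too.

def sanitize_keys(doc):
    if isinstance(doc, dict):
        return {
            key.replace('.', '_').lstrip('$'): sanitize_keys(value)
            for key, value in doc.items()
        }
    if isinstance(doc, list):
        return [sanitize_keys(item) for item in doc]
    return doc


print(sanitize_keys({'$oid': 1, 'a.b': {'$in.x': 2}}))
# -> {'oid': 1, 'a_b': {'in_x': 2}}
```

Note that renaming `$oid` loses the "this was an ObjectId" information; if the source document came from parsed extended JSON, converting it back to a real `ObjectId` before insert may be the better fix.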
[08:21:51] <liquid-silence> ok guys, I need to build a hierarchy tree, where users have a "container" (similar to your home folder in linux), where you can create folders and sub folders and upload files
[08:22:13] <joannac> Xzyx987X: there's one lock. While the write lock is taken, no reads can progress.
[08:22:23] <liquid-silence> also you need to be able to give access to someone else to access a file/folder, read, write, browse
[08:22:40] <joannac> liquid-silence: I have no idea where you're stuck.
[08:22:58] <liquid-silence> joannac to be honest the whole schema is the problem currently
[08:23:14] <joannac> Then maybe you should rethink your schema?
[08:23:14] <liquid-silence> and I am still learning queries with mongo etc...
[08:23:30] <liquid-silence> well I am not sure how to translate a sql schema to mongo
[08:33:04] <Xzyx987X> joannac, it sort of is, I'm not really sure how to apply the information to my code though. here are the two queries I need to make run without any write occuring between them:
[08:33:07] <joannac> What's folders? an array or a subdoc?
[08:34:31] <joannac> Xzyx987X: similar principle. Have a collection with a "I'm reading flag". Set the flag (if not set) when you start your read. Any write should check if that flag is set; if so, wait.
[08:34:45] <liquid-silence> it does not exist yet joannac
[08:35:47] <liquid-silence> the files node does not exist
[08:36:54] <Xzyx987X> ok, so I guess a gc flag in my entry collection would do the trick?
[08:37:42] <joannac> liquid-silence: oh duh, you need to quote a string with a . in it. Also you may need an upsert flag?
[08:37:44] <Xzyx987X> but then if I find that it's enabled, from what I understand the only way to know when garbage collection is complete is to keep polling the flag
[08:37:57] <Xzyx987X> that's workable, but not very efficient
[08:42:19] <Xzyx987X> joannac, so is there any way to run a mongo query that will block until a certain condition is satisfied to avoid the polling issue?
[08:43:38] <joannac> Xzyx987X: Erm, no. What if the query gets stuck? You want to take down the whole server?
[08:44:43] <Xzyx987X> well, preferably it would only block the thread in which it's executed...
[08:45:01] <joannac> Everything will yield: http://docs.mongodb.org/manual/faq/concurrency/#does-a-read-or-write-operation-ever-yield-the-lock
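Since the server offers no blocking wait, the poll-the-flag approach joannac suggested can at least back off exponentially to stay cheap. A minimal sketch, assuming the flag check would really be a `find_one()` against the lock collection; here `gc_flag_is_set` is a counter-based stand-in so the loop terminates without a server:

```python
import itertools
import time

# Stand-in for a driver query like:
#   db.locks.find_one({"_id": "gc", "running": True}) is not None
# Here the flag "clears" on the third check so the example finishes.
attempts = itertools.count()


def gc_flag_is_set():
    return next(attempts) < 2


def wait_for_flag(poll=gc_flag_is_set, base_delay=0.001, max_delay=0.1):
    """Poll until the flag clears, doubling the sleep up to max_delay."""
    delay = base_delay
    while poll():
        time.sleep(delay)
        delay = min(delay * 2, max_delay)


wait_for_flag()
print("flag cleared")
```

The capped exponential backoff keeps the polling load bounded even if the flag stays set for a long time, at the cost of up to `max_delay` extra latency.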
[08:46:45] <joannac> I'm out; you guys can help each other :)
[08:47:47] <Xzyx987X> I'm probably not going to be much help... I've been working with mongo for all of three days now...
[08:59:17] <liquid-silence> why is that not nesting it under the current folder
[09:01:50] <Xzyx987X> haha, wow, I just realized that if I changed the order of the queries, it actually wouldn't matter in this particular case if the data was updated between them >.<
[09:02:04] <Xzyx987X> and now I'm off to bang my head against the wall for an hour...
[09:05:11] <liquid-silence> gah starting to hate this
[09:11:23] <Xzyx987X> you feel dumb? I just spent the past four hours trying to solve a problem that could have been solved by switching the positions of two lines of code
[09:13:48] <jackblackCH> hi, anyone know how to define the encoding when using mongoexport --csv ? i have special chars which look bad in my csv
[09:18:58] <Kim^J> liquid-silence: Can you reorganize and use a dictionary instead?
[11:05:11] <Neptu> can mongodb python driver drop a collection or I need to use command instead?
[11:34:57] <Nomikos> leifw: if you remember that search/sort found-count discrepancy I had last week, turned out the index on that field had only indexed about half the documents >.>
[11:35:31] <Nomikos> removing index, all 700 found (sort() wasn't adding docs, the search was just not finding them all)
[11:44:59] <bin> Hello guys! I have a question and it is: Can i use $elemMatch operator when i want to update particular element in an array and if yes , could you give me an example please. Cheers in advance. ( Couldn't find example in google)
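Yes: match the element with `$elemMatch` in the filter, then address that same element in the update with the positional `$` operator. A hedged example with a hypothetical `items`/`sku`/`qty` schema, shown as pymongo-style dicts:

```python
# Filter: find a document whose 'items' array contains an element matching
# BOTH conditions at once (that's what $elemMatch guarantees).
filter_doc = {
    "items": {"$elemMatch": {"sku": "abc", "qty": {"$lt": 5}}}
}

# Update: 'items.$' refers to the first array element the filter matched.
update_doc = {
    "$set": {"items.$.qty": 5}
}

# With pymongo this would be executed as:
#   db.orders.update_one(filter_doc, update_doc)
# or in the mongo shell:
#   db.orders.update(filter, update)
print(filter_doc, update_doc)
```

The positional `$` only updates the first matching element per document; updating every matching element needs a different approach.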
[12:01:01] <Nomikos> joannac: is there some way to use the new Text Search for the $match part of the aggregation pipeline? the regular index fails because the field contents are too long sometimes..
[12:01:29] <Nomikos> and therefore only returns about half the documents.
[12:01:43] <Nomikos> if I drop the index $match returns the correct set
[12:08:04] <Nomikos> I'm wanting to use the aggregation pipeline to show found-docs-per-category, but I guess I'll have to run that one without an index then.
[12:08:44] <Nomikos> the actual search results showing up in the main part of the page could then still use the new Text Search
[12:10:36] <bin> joannac: sorry to bother ya i will give you an example .. tried to align them properly. -> http://pastebin.com/95b711kR
[12:28:35] <bin> joannac: got it ... you are the best ;)
[13:03:52] <_Heisenberg_> Hi folks. I have a problem with the readPreference settings. Even if I use a command like db.products.find().readPref('primaryPreferred').count() the operation blocks while the cluster is doing a failover and answers after a new primary is elected. I was expecting that the secondaries answer the query even if the cluster is performing a failover?
[13:04:31] <_Heisenberg_> Same for my node.js application where I set the readPreference in the application of course...
[13:06:09] <_Heisenberg_> My Cluster contains two shards, which consist of 5 nodes each
[13:22:52] <kali> _Heisenberg_: during failover, everything is more or less on hold until the election completes, whatever the read preference is
[13:27:45] <_Heisenberg_> kali: the documentations says something different: "In most situations, operations read from the primary member of the set. However, if the primary is unavailable, as is the case during failover situations, operations read from secondary members." ( http://docs.mongodb.org/manual/reference/read-preference/#primaryPreferred)
[13:28:29] <BlackPanx> does mongodb have to bind on localhost ip ?
[13:28:43] <BlackPanx> or can be only internal ip like 192.168.xx.xx
[13:29:38] <Neptu> hey, question about the python driver... I don't find a method to drop the collection, should I use a command or maybe I'm mistaken and there is a method on the MongoClient?
[13:30:02] <Neptu> second question: is it proper to use mongoclient against a replica set or do I need the mongoclientReplica?
[13:31:06] <kali> _Heisenberg_: mmm ok. this does not match my impressions and experience
[13:31:46] <_Heisenberg_> kali: Thats my problem! As I'm using mongo in my thesis I have to make sure that this behaviour is not my fault ^^
[13:33:14] <kali> _Heisenberg_: you may want to try it with another client as the node.js one may not be the most "compliant" of them. try the java one for instance
[13:34:15] <_Heisenberg_> kali: I think that is not needed since I see that behaviour even in the mongos shell :/
[13:34:34] <_Heisenberg_> I'm just curious if I missed a setting or something
[13:36:02] <_Heisenberg_> I'm trying to set diaglog=3 in my mongos config file for a jira issue but if I put that parameter the restart of mongos fails. suggestions?
[13:40:58] <jyee> and nothing in your error log about the failed start?
[13:41:24] <bin> anyone working with mongodb and java ?
[13:46:45] <Danielss89> when i use find() on my collection i get some documents but i have to type "it" to show more.. can i get it to show all instead?
[15:09:44] <PiyushK> hey .. any considerations / pointers to consider while choosing between mongodb and couchbase .?
[15:21:47] <Nodex> PiyushK : right tool for the job :)
[15:25:17] <PiyushK> Nodex, for personalization service and recommendation engine to store and analyze web usage data ...
[15:25:55] <Nomikos> I just found http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis and .. whoa.
[15:26:13] <Nomikos> of course there are other pages like that
[15:46:16] <theCzar> Mongodb n00b here. I'm working on setting up some scripts to configure data migration for MongoDB from one host to another. With databases like Redis, in the past I have just told one running instance to become the slave of the other, waited for them to get in sync, and then shut off the master. What is the best way to do this in MongoDB? I have only seen a way to configure master-slave replication at startup.
[16:02:57] <jiffe99> there something wrong with this statement? db.orders.remove({'coin':'INF','$lt':{'timestamp':1381766396}})
[16:03:13] <jiffe99> after running it db.orders.find({'coin':'INF'}).pretty() shows entries with timestamps less than that value still in there
[16:04:47] <jiffe99> nm I think I have that backwards
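Exactly: comparison operators like `$lt` nest under the field name, not at the top level of the filter. A short illustration of the difference (the field names come from jiffe99's query):

```python
# Misplaced: '$lt' at the top level is treated as a (bogus) field name,
# so it matches nothing and the remove silently deletes nothing.
wrong = {'coin': 'INF', '$lt': {'timestamp': 1381766396}}

# Correct: the operator lives inside the document under the field it tests.
correct = {'coin': 'INF', 'timestamp': {'$lt': 1381766396}}

# In the shell:  db.orders.remove(correct)
print(correct)
```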
[16:35:57] <clarkk> I have a collection of documents with a "category" field - thus multiple documents have the same category. I need to query the docs so that I get arrays of documents with the same category. Could someone give me a hint as to how I would go about this?
[16:36:13] <clarkk> it's very difficult to know what to google for
[16:41:11] <tripflex> clarkk: give me an example of your schema
[16:42:37] <Nomikos> can aggregation be used in any way with an indexer in front of MongoDB, or do those indexers have something similar to aggregation?
[16:43:17] <Nomikos> clarkk: can you sort by category and construct the arrays in code?
[16:43:47] <Nomikos> something like foreach (results as doc) data[doc.category].push(doc)
[16:44:09] <Nomikos> ..that might work without sorting..
[16:45:39] <clarkk> ok Nomikos - I will try that. Thanks
[16:47:20] <clarkk> hmm, on second thoughts, it doesn't sound right
[16:47:31] <tripflex> clarkk: send me an example and i'll tell you what to do
[16:47:33] <clarkk> to do a foreach, it needs to be in an array
[16:47:47] <Nomikos> clarkk: I was kinda guessing as to what you wanted, sorry
[16:47:58] <tripflex> you can use aggregation to do this
[16:48:00] <Nomikos> if you elaborate a little that will help
[16:48:05] <tripflex> but without knowing your schema i can't tell you how to set it up
[16:49:05] <tripflex> you may not even need to use aggregation but it all depends on how your schema is setup
[16:56:47] <clarkk> tripflex: Nomikos my schema is very simple (although the dataset is much larger than this, obviously) http://pastebin.ca/raw/2466634
[17:06:30] <tripflex> and value is array of categories found
[17:06:36] <Nomikos> clarkk: this might work for you? http://pastebin.ca/2466638
[17:07:00] <Nomikos> the logic is in the code, not in mongodb, so it may not be what you were looking for..
[17:07:26] <Nomikos> the mongodb part would simply be a db.coll.find().. I'm pretty new to the db side of things >.>
[17:08:45] <Nomikos> if you sorted on category, then title, it might be simpler still to use the result set. simply set a var for this_category and compare that to category on each .. whatever it is you're doing with the loop
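The one-pass bucketing Nomikos pseudocoded can be sketched in Python like this, assuming the cursor yields plain documents with a `category` field (no sort needed, and each bucket keeps the whole document, not just the title):

```python
from collections import defaultdict

# Sample documents standing in for a cursor; in real code this would be
# the result of db.coll.find().
docs = [
    {"title": "a", "category": "x"},
    {"title": "b", "category": "y"},
    {"title": "c", "category": "x"},
]

# One pass over the cursor: append each whole document to its category bucket.
by_category = defaultdict(list)
for doc in docs:
    by_category[doc["category"]].append(doc)

print(dict(by_category))
```

This does the grouping client-side; the aggregation-based approach discussed below does the same work server-side instead.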
[17:09:01] <tripflex> nah just use aggregation and addtoset
[17:16:47] <tripflex> you don't really need the "category" field but i added it just in case
[17:16:57] <tripflex> because the _id is set as category
[17:17:24] <clarkk> tripflex: not in my real dataset tho - I condensed it for brevity
[17:17:38] <tripflex> i know, but that's where you should start
[17:17:43] <Nomikos> tripflex: would it do a secondary group on 'category' in your example?
[17:17:44] <tripflex> so learn about aggregation and grouping
[17:18:27] <tripflex> it just sets category to the same thing as _id
[17:18:33] <tripflex> the id is what you're grouping them by
[17:19:25] <tripflex> so like, go through my model, group everything by the _id that equals $category, then add category field with $category, and title field. In title field add to the array in the title of current document
[17:20:27] <tripflex> but just like using find, you can group by multiple fields
[17:20:34] <tripflex> so instead of _id: "$category"
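...you use a sub-document as the `_id`. A hedged sketch of both pipelines as pymongo-style dicts (`$addToSet` per tripflex's suggestion; the `author` field in the second pipeline is hypothetical):

```python
# Group by a single field: _id becomes the category, and category is
# duplicated as a plain field for convenience.
by_one_field = [
    {"$group": {
        "_id": "$category",
        "category": {"$first": "$category"},
        "titles": {"$addToSet": "$title"},
    }}
]

# Group by multiple fields: the _id is a compound sub-document.
by_two_fields = [
    {"$group": {
        "_id": {"category": "$category", "author": "$author"},
        "titles": {"$addToSet": "$title"},
    }}
]

# With pymongo:  db.coll.aggregate(by_one_field)
print(by_one_field, by_two_fields)
```

`$addToSet` deduplicates within each group; use `$push` instead if duplicates should be kept.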
[17:40:41] <tripflex> you're using something that i assume will be dynamic as the key
[17:40:53] <tripflex> which will make it difficult if not impossible to run queries on
[17:41:04] <tripflex> if it's dynamic it should be set as a value of a key so you can query the key
[17:44:54] <rafaelhbarros> joannac: hello, friday I had to leave asap. two nodes on aws were terminated, that's the reason why mongodb wasn't picking a master.
[17:45:11] <rafaelhbarros> joannac: I'm not sure you remember the subject we were discussing.
[17:50:05] <sulo> i'm kinda new to mongodb and have actually only read the o'reilly book about it... but i'm kind of wondering whether mongodb is also a good fit for small applications with only one db server.. the book says you should always use at least 3 db servers because of the way the primary server is selected
[17:50:41] <sulo> so i'm kind of wondering whether it is a good idea to start a example application with just one mongo server...
[17:52:44] <kali> sulo: well, it's more or less the same for any kind of DB. you need backups to recover from a total db server crash with any kind of DB, and at least one spare server if you can't afford to be down
[17:52:58] <Nomikos> sulo: we've been deving with a single mongodb server for nearly a year..
[17:53:40] <Nomikos> well, there's the live and the dev server, and both have backups, but it's not distributed or anything yet. not needed yet.
[17:54:21] <sulo> Nomikos: so how do you currently back up your data? is there something like mysqldump?
[17:54:27] <Nomikos> I'd say it's fine for demoing, proof of concept, low loads, learning, ..
[17:55:15] <sulo> i have one more question actually :)
[17:57:36] <sulo> in the book they say that if a record grows it gets moved to another place which can be slow... as i currently see it (and i'm a big noob on that topic ;) ) the whole point of the records is the ability to make them grow... like saving comments to a post in one document... (else i could also use a join in a relational db) .. is there any good practice to prevent such things or make them performant?
[17:58:32] <kali> sulo: on a post/comment model, you would need a HUGE number of comments per seconds for this to become a problem
[17:59:13] <sulo> kali: well in the book they don't give any numbers or something.. they just say that it is slow
[19:39:04] <joannac> clarkk: It doesn't, it just happens to be ordered.
[19:40:48] <clarkk> joannac: in the diagram there are 4 tables. The third is the result of the map function. Why are the keys (holding the array of amounts) the cust_id key?
[19:41:38] <joannac> because that's what the map function does?
[19:45:04] <clarkk> joannac: I don't suppose you have any ideas how to achieve this result, from a collection like that stored in the d object here... http://pastebin.ca/raw/2466722
[19:46:22] <clarkk> each array needs to contain the whole documents (not just the title field)
[19:51:16] <clarkk> it's frustrating - the mongodb people mention pivoting collections, but it's very difficult to find any resources explaining how to do it
[19:51:46] <joannac> So you want to duplicate the output in your pastebin?
[20:46:34] <leonardfactory> I have a question.. I'm using mongo2.4.6 and when I try to sort a query with $geoWithin, it's very slow.. those are the output from .explain(): http://bit.ly/16bWtQ8
[20:48:15] <leonardfactory> It seems it's not using the index at all, even in the $geoWithin query due to the "S2Cursor". However i tried both with a { radius: -1, geometry : '2dsphere' } compound index and a { geometry : '2dsphere', radius : -1 } one
[20:48:16] <leonardfactory> any suggestion? Thank you :)
[21:52:45] <leonardfactory> but nothing happens! Probably I'm doing something really bad here, but I can't see it
[21:53:27] <cheeser> in the shell run, db.<your collection>.getIndexes()
[21:54:17] <cheeser> (disclaimer: the geo query stuff isn't my strongest, but i'll try to help)
[21:54:54] <leonardfactory> yep, I checked it and I have one with "name" : "radius_-1_geometry_2dsphere" currently, checked before and even "geometry_2dsphere_radius_-1" didn't work
[21:59:44] <leonardfactory> https://gist.github.com/leonardfactory/af9ed8eaae36f0575b3f <- this is the real getIndexes output
[22:02:03] <cheeser> joannac: nope. that's exactly it. :)
[22:02:03] <leonardfactory> the index size, from db.areas.stats(), seems reasonable: "radius_-1_geometry_2dsphere" : 41068048
[22:02:29] <joannac> if you .hint() the index, what happens?
[22:02:55] <cheeser> i'm wondering if the order of those fields matter to a 2d index.
[22:03:14] <leonardfactory> I tried, but the cursor was every single time a `S2Cursor`, no performance boost, nothing
[22:04:23] <leonardfactory> cheeser: I didn't understand if for sorting with a 2dsphere index the sorting field must be placed in the compound index BEFORE the 2dsphere field, however I tried swapping them and it didn't use the index anyway
[22:07:14] <joannac> I tried this yesterday with a 2d index, with the 2d field first, then the non-2d field later
[22:15:20] <leonardfactory> I'm a noob so, can you tell if it is ok to do what I'm trying to achieve (search 2dsphere -> sort results) even having the 2dsphere index first?
[22:38:15] <Gaddel> if i want something like coll.distinct(), but returning values from multiple keys instead of one key, what is the fastest way to do this? multiple "distinct" queries, or a regular find() with projection operator, or aggregation framework?
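One option is a single aggregation pass: one `$group` stage with an `$addToSet` per key collects all the distinct sets in one scan of the collection, instead of one `distinct()` query per key. A hedged sketch with hypothetical `color` and `size` fields, plus a pure-Python rendering of what the stage computes:

```python
# One $group over the whole collection (_id: None) gathers distinct
# values for several keys at once.
pipeline = [
    {"$group": {
        "_id": None,
        "colors": {"$addToSet": "$color"},
        "sizes": {"$addToSet": "$size"},
    }}
]

# Pure-Python equivalent of the stage, on sample documents:
docs = [{"color": "red", "size": 1}, {"color": "red", "size": 2}]
colors = {d["color"] for d in docs}
sizes = {d["size"] for d in docs}
print(colors, sizes)
```

Whether this beats multiple `distinct()` calls depends on indexes: `distinct()` can be covered by an index per key, while the single `$group` trades index use for touching the collection only once. Measuring both on the real data is the only reliable answer.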