[00:23:01] <Streemo> is there a best way to get the documents returned by a query in the same order as the array passed to $in when querying?
[00:23:19] <Streemo> ive seen a solution using aggregate
[00:24:17] <owen1> GothAlice: doesn't bind_ip 0.0.0.0 mean all IP addresses on the local machine? i need all ip addresses in the world, i think.
[00:24:31] <GothAlice> owen1: You're thinking about it backwards.
[00:24:33] <owen1> (i already have the firewalls (security groups))
[00:25:37] <GothAlice> A typical VM has two or three "interfaces": 127.0.0.1 (loopback, lo0), x.x.x.x (public IP, eth0), optionally 10.x.x.x (infrastructure-local, private IP, eth1).
[00:28:16] <owen1> GothAlice: by vm do you mean EC2, or does it also apply to my laptop?
[00:28:17] <Streemo> it's a dynamic calculation of a user's score based on their location. essentially, users can drop items and gain points for them. a user's score is a function of her x,y coordinates
[00:29:06] <GothAlice> owen1: That three-way arrangement is typical for a virtual machine server, i.e. EC2, Rackspace Cloud, whatever. Your laptop likely has three as well, but they'd be loopback, ethernet, and wifi.
[00:29:24] <Streemo> so i keep scoreboards around the world, and then i aggregate group them to get the users total score at any (x,y) of radius R_o
[00:29:55] <owen1> GothAlice: awesome. thank you again
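The bind address discussed above selects which local interfaces a server listens on; it does not filter which remote clients may connect (that is the firewall's job). A minimal sketch using plain sockets rather than mongod itself, with `bind_and_report` being a hypothetical helper name:

```python
import socket


def bind_and_report(host):
    """Bind a TCP socket to `host` on an OS-chosen port and return the bound address."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS pick any free port
        return sock.getsockname()[0]


# Loopback only: reachable solely from this machine.
print(bind_and_report("127.0.0.1"))
# Wildcard: listens on every local interface (loopback, public, private).
print(bind_and_report("0.0.0.0"))
```

Binding to 0.0.0.0 is what lets clients elsewhere reach the public interface at all; which clients are actually allowed in is then decided by the security groups.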
[00:30:04] <Streemo> but that only gives me a sorted list of user ids based on who is winning
[00:30:27] <Streemo> then i need to use that list to query db.users to actually get the sorted profile info
[00:32:54] <GothAlice> Streemo: Spool the result of the second query into a mapping of user ID to document, then iterate the first (used to get the results in the right order) pulling from that mapping as needed.
[00:33:23] <Streemo> the second query is the end though,
[00:33:36] <GothAlice> Except, because you need the results back in a certain order, it's not.
[00:33:57] <GothAlice> You're going to have to buffer the results and sort/iterate them client-side in the correct order.
[00:34:22] <GothAlice> (Pro tip: don't bother sorting; just iterate and pull them from a mapping in the correct order. It'll be much faster.)
[00:35:42] <Streemo> ugh x.x i am not sure what you mean by pull them from a mapping
[00:35:57] <Streemo> I have the sorted list from the first query
[00:36:24] <GothAlice> MongoEngine provides a method to do exactly this type of thing: http://docs.mongoengine.org/apireference.html#mongoengine.queryset.QuerySet.in_bulk
[00:36:31] <GothAlice> Yup; keep that sorted list.
[00:36:39] <GothAlice> Pass it in the query to get the actual user details.
[00:36:57] <GothAlice> Get the user details and stream them into a mapping/dict/object as id: document pairs.
[00:37:57] <GothAlice> Then, to finally get the results back in the right order, iterate the sorted list and yield (or build a list/array/etc of) documents, pulling from that mapping. for user_id in user_ids: yield users[user_id]
[00:38:29] <Streemo> ah you're saying do it externally?
[00:38:43] <GothAlice> I've been saying the entire time that you _have to_ do this within your application.
[00:38:51] <GothAlice> MongoDB can offer no assistance with this type of shenanigans. ;)
[00:39:18] <Streemo> 1. have sorted list of UserIds. Query db.users for $in that list, return a cursor. Then sort cursor manually based on order in list
[00:39:58] <GothAlice> (Depending on language in use, such sort operations may physically move large chunks of data around in RAM to organize them into a sorted compact array.)
[00:40:16] <GothAlice> A sparse hash table / mapping / dictionary / JS-style object will avoid this.
[00:40:26] <Streemo> so i don't sort, i have a wrongly sorted user array!
[00:48:32] <Streemo> ok so what i said above ^ is what you did in python, so i think i understood you
[00:48:35] <GothAlice> Technically that's a generator expression, too, BTW. (No square brackets. I.e. this is the "yield" approach, as a comprehension. No temporary list is created.)
[00:50:10] <GothAlice> It's important to project down to only the essential fields you need when doing bulk reads like this, though. Without it you'll be transferring potentially a lot of data you'll just be throwing away, and it can have a severe impact on performance.
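The mapping approach described in this exchange can be sketched as follows. This is an illustration with made-up sample data, not the code that was shared in the channel; `in_list_order`, `ranked_ids`, and the `name` field are assumed names.

```python
def in_list_order(ordered_ids, documents):
    """Return a generator of documents in the order given by ordered_ids.

    `documents` may arrive in any order (as from an $in query).
    IDs with no matching document are skipped.
    """
    by_id = {doc["_id"]: doc for doc in documents}  # one pass to build the mapping
    return (by_id[_id] for _id in ordered_ids if _id in by_id)


# With a driver such as PyMongo the second query might look like this
# (assumed collection and field names; the projection keeps only what is needed):
#   cursor = db.users.find({"_id": {"$in": ranked_ids}}, {"name": 1})
ranked_ids = [3, 1, 2]  # stands in for the result of the scoreboard aggregation
unordered = [
    {"_id": 1, "name": "ann"},
    {"_id": 2, "name": "bob"},
    {"_id": 3, "name": "cy"},
]
print([doc["name"] for doc in in_list_order(ranked_ids, unordered)])
```

No sort is performed: building the dict is one pass over the documents and each lookup is a constant-time hash access, so the documents themselves are never shuffled around in memory.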
[02:44:36] <GothAlice> https://github.com/marrow/tags/blob/develop/examples/complex/master.py#L14-L24 (as an example) is valid Python. The only "whitespace" that matters starts line 15—all the rest in that selected area is for style. ;)
[02:46:27] <GothAlice> (The "first fastest" involves simply appending to a list a bajillion times then "".join(ALLTHECONTENT) at the end.)
[02:46:39] <GothAlice> (Which is even more hideous than this.)
[02:47:25] <GothAlice> Neither the line breaking nor indentation used are required; it could be written as one line, or in any other way one might desire.
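The "append to a list, then join once" pattern mentioned above looks roughly like this in Python. The `render` function and the HTML it emits are illustrative assumptions, not taken from the linked repository.

```python
def render(items):
    """Build an HTML list by collecting fragments and joining them once at the end."""
    chunks = []
    chunks.append("<ul>")
    for item in items:
        chunks.append("<li>")
        chunks.append(str(item))
        chunks.append("</li>")
    chunks.append("</ul>")
    # A single join avoids creating a new intermediate string per fragment.
    return "".join(chunks)


print(render(["a", "b"]))
```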
[02:48:50] <greyTEO> might not be a quick question, but do you use LVM for mongo backups?
[02:49:06] <GothAlice> That's the big difference I found when I switched from PHP. PHP has one request = one instance lifecycle. Python's web apps are typically resident, and respond to many requests. Makes caching (and all sorts of things) a whole different bucket of bolts.
[02:49:24] <GothAlice> I use ZFS snapshots, alas, not LVM snapshots, for my home dataset.
[02:49:43] <GothAlice> Elsewhere I use offsite replication.
[02:50:25] <greyTEO> php would be my language of choice…for better or worse
[02:53:26] <GothAlice> (I developed PHP code for many years in the PHP 2 to PHP 6 range. Amazingly, my PHP 3 client sites are still running effectively unmodified today on modern interpreters.)
[02:55:05] <GothAlice> The 2 code was upgraded for PHP 5.
[02:55:15] <greyTEO> solid as backwards compatible
[02:55:15] <GothAlice> (5 broke a lot of really ancient code. ;)
[02:55:34] <GothAlice> greyTEO: Only because I was hyper-aware of all of the many utterly stupid design decisions, and avoided the pitfalls, like register_globals.
[02:55:42] <GothAlice> The "average" code will not be so armoured.
[02:56:15] <greyTEO> there are a lot of nasty things you can do in php, no doubt
[02:57:30] <GothAlice> example.com/index.php?page=about.php served by <? include($_GET['page']); ?> with allow_url_fopen enabled. I had a client do this. I fired him.
[02:59:57] <GothAlice> Being a vertical market service provider has ensured that every Monday, when I review tickets, a little part of me dies, inside.
[03:03:46] <GothAlice> Well, yeah, I'd recommend LVM over ZFS for most use cases.
[03:04:15] <GothAlice> I had a particular one that required exporting the frozen snapshot as a volume to my desktop for backup purposes. (Back Blaze FTW!)
[03:04:36] <greyTEO> ill let you know how stuck I get tomorrow with python….should be a fun day. (hair loss included)
[04:38:29] <luaboy> how to make mongod run as a daemon service like mysql service ?
[07:54:39] <xaxxon> I have some questions about how to store data "mongo-y". I'm doing some meteor development and if I have a bunch of blog posts and each blog post has a bunch of comments associated with it, how do I do it mongo-y? The way I'm doing it is...
[07:55:01] <xaxxon> A collection of posts and another collection of comments where each comment is a document with the id of the post it belongs to.
[07:55:21] <xaxxon> but that's exactly how I'd do it in SQL... so I don't know if it's "wrong"
[08:10:13] <compeman> this is an example. you can change the find parameters of course.
[08:24:24] <czardoz> Since NoSQL does not allow joins, what are some best practices for avoiding two queries?
[08:24:45] <czardoz> I mean, say a "User" contains an "Address"
[08:25:30] <czardoz> and if we are trying to filter "Users" by "Address", we'll end up making multiple calls to the DB
[08:25:41] <czardoz> is there any way to avoid this?
[08:32:06] <morenoh149> czardoz: nest the address in the user model
[08:34:49] <czardoz> morenoh149: "Address" is going to be used independently as well. I mean, without the "User". I could still nest it inside "User", but wouldn't that be unnecessary duplication? Plus, if I do that, I'll have to take care of keeping the nested address in sync with the "Address" collection
[08:43:15] <sybarite> How is the padding factor calculated in the wiredtiger storage engine? Is it the same as the previous MMAP v1 engine or things will be different here?
[09:07:31] <nfroidure> Hi! I'd like to concatenate arrays into a single array in a $group instruction of an aggregation pipeline, how to achieve this ?
[09:07:53] <nfroidure> The only thing i manage to get is an array of arrays.
[09:08:24] <nfroidure> I know i could use unwind but was looking for something more "user-friendly"
[10:10:36] <Marbug> when you connect to a slave, and want to write to the mongodb, will the connection be redirected to the master?
[10:11:08] <Derick> Marbug: only if you connect to a slave as part of a replicaset connection
[10:11:30] <Derick> i.e. you must specify replicaSet=replSetName in your connection string (or equivalent matter)
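A replica set connection string lists one or more seed members and names the set; in the standard connection string format the option is spelled `replicaSet`. A small sketch that builds such a URI, where the host names and the set name `rs0` are placeholders:

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set):
    """Build a mongodb:// URI naming several seed hosts and the replica set."""
    query = urlencode({"replicaSet": replica_set})
    return "mongodb://" + ",".join(hosts) + "/?" + query


uri = replica_set_uri(["db1.example.com:27017", "db2.example.com:27017"], "rs0")
print(uri)
```

Given a URI like this, a driver discovers the current primary itself and sends writes there, which is the "redirect" behaviour asked about; a direct connection to a single secondary gets no such routing.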
[12:10:29] <oznt> hi everyone, is there a difference between doing db.eval inside a driver's code and setting a lock, executing some code and removing the lock?
[12:10:50] <oznt> is there a difference between the code snippets here http://paste.debian.net/165701/?
[12:45:03] <sybarite> oznt, I am not sure if I am missing an internal details but both the codes should ideally evaluate to the same query
[12:46:28] <oznt> sybarite, thanks, I also saw that mongodb v3 deprecated the method eval, so it's better not to use it
[13:39:09] <oznt> what happens when 2 clients do find_one_and_update (or in mongo findAndModify) ? Can it be that when they start very near to each other the second one will read the information before the first connection modified it?
[13:49:47] <GothAlice> oznt: Only if you are performing a multi-update.
[13:50:09] <GothAlice> oznt: Singular updates are atomic. From the documentation: http://docs.mongodb.org/manual/reference/method/db.collection.findAndModify/#comparisons-with-the-update-method
[13:50:37] <GothAlice> "When modifying a single document, both findAndModify() and the update() method atomically update the document. See" http://docs.mongodb.org/manual/core/write-operations-atomicity/ "for more details about interactions and order of operations of these methods."
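A common use of this single-document atomicity is letting several workers claim jobs without stepping on each other. The sketch below assumes a PyMongo-style collection object with a `find_one_and_update(filter, update)` method; the `state` and `owner` fields and the `claim_next_job` helper are hypothetical.

```python
def claim_next_job(jobs, worker_id):
    """Atomically claim one pending job for worker_id.

    `jobs` is expected to behave like a PyMongo collection. The match and the
    modification happen as one server-side operation on a single document, so
    two workers calling this at the same moment cannot both claim the same job.
    Returns the job as it was before the update, or None if nothing is pending.
    """
    return jobs.find_one_and_update(
        {"state": "pending"},
        {"$set": {"state": "claimed", "owner": worker_id}},
    )
```

The key point is that the filter (`state` is `pending`) is part of the same operation as the update, so the second caller no longer matches the document the first one just claimed.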
[14:26:41] <greyTEO> when running mongo-connector, I keep getting this error:TypeError: 'Database' object is not callable. If you meant to call the 'disconnect' method on a 'MongoClient' object it is failing because no such method exists.
[15:36:55] <Torkable> is it possible to use both the multi and remove options in a findAndModify?
[16:06:57] <greyTEO> cheeser, do you have experience with mongo-connector?
[16:26:06] <greyTEO> maybe GothAlice has some insight
[16:26:24] <greyTEO> maybe a gist with the error and config: https://gist.github.com/dmregister/27ae6a81ab86442ca9db
[16:44:57] <reynierpm> hi there and morning to everyone can I get some advice with this post at SO http://stackoverflow.com/questions/29519988/execute-query-at-mongolab-site ? I have been stuck on this for a while
[16:55:27] <greyTEO> cheeser, someone just opened a ticket with the issue I was having. https://github.com/10gen-labs/mongo-connector/issues/240
[17:33:00] <tmaier> I want to query every document with two conditions: item1==true AND item2==true
[17:33:00] <tmaier> I found out that I need the aggregation function with a few parameters and that it's necessary to use the unwind function but I can't find a query for a simple example like the above. Could you help me out please?
[17:33:29] <GothAlice> tmaier: Aaaand automatically /ignored for five minutes. Do not paste directly into the channel, please. Use Gist or another pastebin service.
[17:35:02] <govg> and /usr/bin has nothing like mongorestore etc
[17:35:22] <tmaier> Hi, I'm new to MongoDB and I've got a problem with a nested document hierarchy like this: http://pastebin.com/DVKkbf8J
[17:35:22] <tmaier> I want to query every document with two conditions: item1==true AND item2==true
[17:35:22] <tmaier> I found out that I need the aggregation function with a few parameters and that it's necessary to use the unwind function but I can't find a query for a simple example like the above. Could you help me out please?
[17:36:37] <GothAlice> tmaier: Checking the IRC log (heh; this auto-ignore thing is terrible, I can't see anything you write for another two minutes or so) pasting three lines at once is still pasting. (It's a rate limit / flood protection thing.) Which item1 and item2? You have two sets, and don't specify in your question.
[17:37:20] <govg> okay, found it, there is a separate package in Arch Linux, called mongodb-tools
[17:37:32] <GothAlice> govg: Ah, yes, the plight of binary distributions splitting things up. ;)
[17:38:05] <deathanchor> is there a more appropriate channel for asking questions about tokumx? toku and tokumx don't exist
[17:38:19] <GothAlice> deathanchor: I suspect they don't IRC, or just hang out here.
[17:38:39] <GothAlice> They support ludicrously large corporations that need to squeeze the last ounce out of MongoDB.
[17:39:50] <deathanchor> well I got a problem with a new member in my replset. I've never seen highestKnownPrimaryInReplSet before in my rs.status() output
[17:41:08] <GothAlice> tmaier: As your structure contains no arrays/lists, $unwind is useless, and an aggregate is overkill for the level of query you are trying to issue. Within a query (i.e. db.collection.find({…})) the separate elements are treated as "and"ed together. db.collection.find({'foo.bar': True, 'foo.baz': True}) — both conditions must be true.
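To make the implicit "and" concrete, here is a simplified in-memory imitation of how a query document with two dotted-field conditions selects documents. It handles plain equality on nested dicts only and is not how MongoDB itself is implemented; `get_path` and `matches` are hypothetical helpers, and `foo.bar` / `foo.baz` are the placeholder names from the message above.

```python
def get_path(document, dotted):
    """Follow a dotted path such as 'foo.bar' through nested dicts."""
    current = document
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(document, query):
    """True only if every field condition in the query holds (implicit AND)."""
    return all(get_path(document, path) == value for path, value in query.items())


query = {"foo.bar": True, "foo.baz": True}
docs = [
    {"_id": 1, "foo": {"bar": True, "baz": True}},
    {"_id": 2, "foo": {"bar": True, "baz": False}},
]
print([d["_id"] for d in docs if matches(d, query)])
```

The same `query` dict is what one would hand to a driver's `find` call; no aggregation or `$unwind` is involved because nothing in the structure is an array.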
[17:41:09] <tmaier> GothAlice: I just did a few Shift+Return. Sorry, I don't use IRC very often :-)
[17:41:43] <tmaier> GothAlice: I just updated the data structure. http://pastebin.com/e98LgExq
[17:42:13] <GothAlice> Same question: which item1/item2 are you interested in querying against?
[17:43:03] <GothAlice> deathanchor: You may have luck diving through their documentation http://docs.tokutek.com/tokumx/ — their fork does change a fair amount of the status and internal structures.
[17:43:09] <tmaier> GothAlice: Ah, ok, I'll try a simple and-combination.
[17:44:54] <tmaier> GothAlice: I think I have tried that yesterday. I just want to be sure and test it again.
[17:45:40] <deathanchor> GothAlice: thx, their docs aren't that great and I have already read through a lot of it. Nothing found for this case
[17:47:37] <GothAlice> deathanchor: Then a ticket or e-mail their way may be useful. They also seem to have mailing lists: http://www.tokutek.com/support/
[17:49:39] <Shapeshifter> Hi. I have a problem with the scala mongodb driver (casbah). I'm trying to retrieve an attribute, which in the mongo shell looks like this: "priceRange" : [ 0.5, 0.9, "piece" ]. It had been persisted through casbah as a (Double,Double,String). But if I try to do getAs[(Double,Double,String)] I always get a None.
[17:51:01] <tmaier> GothAlice: Okay, I'm so dumb... A simple and operation is the solution. Thank you very much :)
[17:55:09] <Shapeshifter> nvm it seems like casbah cannot extract tuples directly. I have to extract a MongoDBList and extract the elements individually from there.
[18:05:53] <GothAlice> tmaier: Complexity is the mind-killer. It is the little death that brings total oblivion. ;) (To paraphrase a Dune quote.)
[18:07:32] <tmaier> GothAlice: Yes, but it seems to me that the mongodb documentation tries to force me into complexity :D
[18:08:13] <tmaier> GothAlice: ... or I searched for the wrong terms in Google ;-)
[19:57:48] <gabrielsch> I have a collection of category that has a reference to parent (that is another category)
[19:57:56] <gabrielsch> is there any way to count the parents recursively?
[19:58:13] <gabrielsch> and get how many ancestors I have
[20:03:23] <GothAlice> gabrielsch: No, not at all.
[20:04:26] <GothAlice> gabrielsch: MongoDB isn't a relational database, so any time you have a "reference" like that, alarm bells should start going off in your head.
[20:06:50] <gabrielsch> GothAlice, should I have an embed many collection inside category?
[20:07:49] <GothAlice> gabrielsch: https://gist.github.com/amcgregor/901c6d5031ed4727dd2f is an example "taxonomy" (hierarchical structure) model mix-in for Python/MongoEngine. Note that to support the widest range of queries, I'm storing the parent, list of parents, a short name/slug, the coalesced path (name/slug of all parents + current) and a numerical order to preserve sorting.
[20:08:23] <GothAlice> Other approaches include combined nested set + adjacency list models. (Where you have a parent reference, and integer left/right values that count around the perimeter of the tree.)
[20:08:47] <GothAlice> Because MongoDB isn't a graph database designed for hierarchical data like this, all solutions will be relatively ugly.
[20:09:34] <GothAlice> (That example of mine conforms to the jQuery DOM manipulation and traversal API, for consistency sake. ;)
[20:10:46] <GothAlice> Storing a concrete list of _all_ parent references (the "ancestors") would give you the answer instantly without additional queries, of course.
[20:11:27] <gabrielsch> because that is the parent haha
[20:11:46] <GothAlice> Since you know which collection contains your parent objects, you can avoid storing a full DBRef and just use an ObjectId, FYI. It'll take up a lot less space MongoDB-side.
[20:12:03] <GothAlice> But also add to that: {…, parents: [ObjectId("55257f5eaaaec4a32f8b456a")]}
[20:12:21] <GothAlice> By getting the "parents" list back (not just "parent") you can get the depth by getting the length of that list.
[20:13:56] <GothAlice> And also efficiently query "get me document X and all parents" (fetch document X, fetch _all_ parents of document X—two queries regardless of depth.)
[20:16:23] <GothAlice> Storing the list of parents is far more useful than just the depth as an integer. ;)
[20:18:29] <GothAlice> Also easier to update, what with $push and $pull.
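A sketch of what the stored "parents" list buys you: depth is just the list length, and all ancestors come back from one extra `$in` query. The sample data uses short string ids in place of ObjectIds, and it assumes the list is kept root-first (which is what appending with `$push` on attach produces); `depth`, `ancestors_filter`, and `breadcrumb` are hypothetical helpers.

```python
def depth(category):
    """Number of ancestors, read straight off the stored list."""
    return len(category["parents"])


def ancestors_filter(category):
    """Query document that fetches every ancestor in one round trip."""
    return {"_id": {"$in": category["parents"]}}


def breadcrumb(category, fetched_ancestors):
    """Names from root to the category itself; fetched_ancestors may be in any order."""
    by_id = {doc["_id"]: doc for doc in fetched_ancestors}
    trail = [by_id[_id]["name"] for _id in category["parents"]]
    return trail + [category["name"]]


shoes = {"_id": "c3", "name": "Shoes", "parent": "c2", "parents": ["c1", "c2"]}
# Pretend this came back (unordered) from find(ancestors_filter(shoes)).
fetched = [{"_id": "c2", "name": "Clothing"}, {"_id": "c1", "name": "Store"}]

print(depth(shoes))
print(ancestors_filter(shoes))
print(" / ".join(breadcrumb(shoes, fetched)))
```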
[20:18:49] <MadLamb> GothAlice, his schema already has the children, it will allow getting the entire tree, but he needs his tree to have a maximum depth of 3.
[20:20:02] <MadLamb> GothAlice, guess he didnt set children there, but actually there is a children property.
[20:20:31] <GothAlice> Likely that's client-side virtual, and simply issues a db.Category.find({parent: this._id}) call.
[20:21:15] <GothAlice> Pro tip: it's almost never a good idea to store concrete lists of child IDs in the parent. This way lies madness. (I.e. it preserves order… of a list of IDs… that you can't query to fetch and get returned in the correct order.)
[20:22:39] <MadLamb> GothAlice, considering that there is a children, and children would be enough for building the tree, wouldn't storing the depth be enough, instead of another list of parents?
[20:23:54] <MadLamb> GothAlice,. db.Category.find({depth: 0}) = all roots
[20:23:57] <GothAlice> MadLamb: A user requests /browse/category/bob and the page needs to display a breadcrumb navigation list to easily browse back up to the parents. Not storing the full list of parents in this example will require recursive application-side querying, i.e. one full query for each level. That's nuts, even if you limit the maximum depth.
[20:24:17] <MadLamb> GothAlice, hmm, didn't think about that.
[20:24:46] <GothAlice> Trust me, I've been wrangling with hierarchical structures in MongoDB for many years now, to support my CMS. ;)
[20:25:18] <GothAlice> https://github.com/marrow/contentment/blob/develop/web/extras/contentment/components/asset/model/__init__.py#L58-L63 is version 2 of the CMS. (The taxonomy thing is from the in-development version 3).
[20:25:48] <GothAlice> Version 2 chose to use a list of child references. I regretted this rather quickly. ;)
[20:27:35] <MadLamb> GothAlice, what about having both? I'm unsure about how to build the tree without querying the children. (we'll have to load every single category)
[20:28:37] <GothAlice> https://gist.github.com/amcgregor/ee96bbaf2ef023aa235f#file-contentment-v0-py-L110-L114 is version 0's tree storage (using a relational back-end, also regretted that pretty quickly) — using the nested set + adjacency list approach (note the l(eft) and r(ight) fields) suffered from potentially needing to touch every record in the collection in order to insert one record; not good!
[20:30:21] <GothAlice> db.Category.find({'$or': [{parent: null}, {parent: {$exists: 1}}]}) = all roots; you can avoid the $or if you are careful and either always omit the field (use $exists only; not advised) or always have a value, even if it's null to represent no parent (benefits from indexes).
[20:31:16] <GothAlice> Er, and that should be {parent: {$exists: 0}}, rather.
[20:33:46] <MadLamb> GothAlice, the mongodb doc example uses a procedure to generate the ancestors, we dont like that approach, do you have another option?
[20:33:57] <MadLamb> GothAlice, do you recommend that approach?
[20:34:18] <GothAlice> Store them; to store them, though, first you'd need to do the "dirty" (application-side recursive) approach to build the list for each document prior to writing it out for future use.
[20:34:58] <GothAlice> And update your code to maintain that parents list (see my taxonomy example for the magic $push/$pull invocations to do this).
[20:35:21] <MadLamb> i don't really understand python that much.
[20:35:23] <GothAlice> Waitaminute, I see I gisted the queryset, but not the operations. My bad; I'll update it immediately.
[20:36:49] <GothAlice> https://gist.github.com/amcgregor/901c6d5031ed4727dd2f#file-taxonomy-py-L81 — $push when "attaching" a node
[20:37:45] <GothAlice> https://gist.github.com/amcgregor/901c6d5031ed4727dd2f#file-taxonomy-py-L98 — $pull of the parent list from all children of the node to detach, then on line 103, removal of the parents from the node being detached.
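For readers who do not follow the Python in the gist, the attach/detach bookkeeping can be expressed as plain (filter, update) pairs. This is a sketch of the idea only, not the gist's actual code: the field names `parent` and `parents`, the root-first ordering, and both helper functions are assumptions. The first pair in each list would go to a multi-document update, the second to a single-document update.

```python
def detach_updates(node):
    """(filter, update) pairs that turn `node` into a root, keeping its subtree intact."""
    old_ancestors = list(node["parents"])
    return [
        # Every descendant lists node["_id"] among its parents; strip the old ancestors there.
        ({"parents": node["_id"]}, {"$pullAll": {"parents": old_ancestors}}),
        # The node itself no longer has a parent or ancestors.
        ({"_id": node["_id"]}, {"$set": {"parent": None, "parents": []}}),
    ]


def attach_updates(node, new_parent):
    """(filter, update) pairs that hang a root `node` (and its subtree) under new_parent."""
    new_ancestors = new_parent["parents"] + [new_parent["_id"]]
    return [
        # Descendants gain the new ancestors at the front of their list (root first).
        (
            {"parents": node["_id"]},
            {"$push": {"parents": {"$each": new_ancestors, "$position": 0}}},
        ),
        # The node records its direct parent and its full ancestor list.
        ({"_id": node["_id"]}, {"$set": {"parent": new_parent["_id"], "parents": new_ancestors}}),
    ]


clothing = {"_id": "c2", "parent": "c1", "parents": ["c1"]}
orphan = {"_id": "c3", "parent": None, "parents": []}
for flt, update in attach_updates(orphan, clothing):
    print(flt, update)
```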
[20:40:18] <GothAlice> MadLamb: Hmm, I remember finding a repo filled with JS tree structure examples somewhere. Let me see if I can dig it up, it may be simpler and/or easier to read.
[20:41:44] <GothAlice> MadLamb: Oh, as a quick note while Exocortex continues to dig, http://docs.mongodb.org/ecosystem/use-cases/category-hierarchy/ — the official example for hierarchical category storage uses "ancestors" ("parents") lists in addition to the singular parent reference.
[20:42:25] <GothAlice> Actually, I'll cancel the search; that article is exactly the process I'm trying to describe. :)
[20:45:40] <MadLamb> GothAlice, i saw that article, i just didnt like the procedure.
[20:47:10] <GothAlice> If you're willing to put up with tens to hundreds of query round-trips being needed to service a single request, then any structure whatsoever will do, even a fully relational one like you were trying. But MongoDB isn't a relational database, so certain design considerations need to be made in your data structures to support the types of queries you need.
[20:47:28] <GothAlice> And, as with the breadcrumb navigation example, there are always queries you'll need that you can't think of at the moment.
[20:48:20] <GothAlice> Thus having several common structures (nested sets, adjacency lists, coalesced paths, etc., etc.) to support different types of queries. And mixing them to support multiple different types of query.
[20:48:42] <MadLamb> GothAlice, i guess i could have the ancestors list, the children list, and the parent. That would achieve every possible query and allow me to build the tree by using the children properties.
[20:49:04] <GothAlice> Using a list of parents (ancestors), for example, lets you quickly find out every descendant (not just child) of a particular category by asking: db.Category.find({parents: ObjectId(…)}) — if X document is an ancestor
[20:49:12] <GothAlice> Really, avoid the children list approach.
[20:49:28] <GothAlice> It's practically useless in terms of querying capability, and only gives the illusion of ordering child nodes.
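The descendants query mentioned above works because an equality condition on an array field matches when any element of the array equals the value. An in-memory imitation with made-up categories (string ids stand in for ObjectIds; `descendants` is a hypothetical helper):

```python
def descendants(categories, ancestor_id):
    """In-memory equivalent of find({"parents": ancestor_id})."""
    return [doc for doc in categories if ancestor_id in doc["parents"]]


categories = [
    {"_id": "c1", "name": "Store", "parents": []},
    {"_id": "c2", "name": "Clothing", "parents": ["c1"]},
    {"_id": "c3", "name": "Shoes", "parents": ["c1", "c2"]},
    {"_id": "c4", "name": "Books", "parents": ["c1"]},
]
print([doc["name"] for doc in descendants(categories, "c2")])
print([doc["name"] for doc in descendants(categories, "c1")])
```

One query returns children, grandchildren, and deeper levels alike, which a list of child ids stored on the parent cannot do without one query per level.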
[20:49:37] <MadLamb> GothAlice, that way i'll have to build the tree.
[20:50:23] <GothAlice> "Give me all nodes sorted by ancestors." — bam, all categories already sorted into the correct order for display based on depth, i.e. as an HTML tree structure of some kind for navigation.
[20:50:57] <GothAlice> In the case of my taxonomy, that's "give me all nodes sorted by ancestors, then the 'sort' order field"
[20:52:22] <GothAlice> The way the docs.mongodb.org tutorial does it, it even saves you from needing to do a second query to generate a breadcrumb list because it caches the name of all parents in the children. This is a further optimization.
[20:53:44] <MadLamb> GothAlice, hmm, i guess i got it now. Even with a children list i would have to query for the nested data.
[20:56:46] <MadLamb> GothAlice, i actually liked the example given by mongo docs, the only thing that bit me was the procedure. I don't like to have inconsistent state in my app and depend on persistence to generate things.
[20:57:22] <GothAlice> Well, you could explore http://www.codeproject.com/Articles/521713/Storing-Tree-like-Hierarchy-Structures-With-MongoD — it lists pretty much every practical way to store trees in MongoDB.
[20:58:48] <GothAlice> It covers the methods, but really, there are so many considerations that going with the MongoDB tutorial approach will likely avoid many pitfalls in the future. (Trees are hard. They might seem simple, but they're really, really not. ;)
[20:59:38] <GothAlice> Don't miss the 10gen article link in the introduction, too.
[21:04:58] <wkennington> scons: *** [build/linux2/c++11_on/release/ssl/use-system-boost/use-system-pcre/use-system-snappy/use-system-tcmalloc/use-system-wiredtiger/use-system-yaml/use-system-zlib/mongo/mongod] Argument list too long
[21:05:00] <wkennington> scons: building terminated because of errors.
[23:18:55] <hemmi> Hey there. I am looking for a database in which I can store processed log events and have them queried by my user-facing API. Is mongo a good fit for this?