[00:34:27] <sobel> GothAlice: i appreciate the concern, but it's a dev server with a dataset that needed a simple reporting query run against it
[00:35:34] <sobel> GothAlice: i ended up dumping to CSV, importing to postgres, and querying from there. that was much more direct than upgrading mongodb and trying to write the query with aggregate()
[00:36:33] <GothAlice> sobel: I take it your dev boxen don't match production, then.
[00:37:01] <GothAlice> It's terrifying either way, of course. ;)
[00:37:15] <sobel> yeah, i believe they're on at least 2.2 in prod
[00:44:32] <GothAlice> sobel: The joys of virtual machines. My DB primary has 964 days of uptime at this point. :D
[00:44:55] <GothAlice> (Surviving several dom0 hot migrations.)
[00:44:57] <sobel> it's just messy, because there are so many moving parts. it supports wacky/proprietary auth adapters built as apache modules, which may or may not build today
[01:03:09] <quuxman> GothAlice: thanks, you've been quite helpful today :-}}
[01:03:51] <GothAlice> quuxman: Ask almost anyone, it's what I do. Were you able to poke at pymongo internals?
[01:05:14] <quuxman> I just got back from lunch and a meeting. __Cursor_spec just dumps the object structure that I passed to find, which I assume will be exactly the same in the web server version, but I'll double check, and also check more of these __* cursor properties
[01:09:03] <quuxman> oops, I want _Cursor__query_spec, not _Cursor__spec
[01:11:28] <quuxman> But the two _Cursor__query_spec() calls appear to return the same thing in both environments, but only one results in an exception
[01:17:52] <GothAlice> Then it's not query generation. Using the connection in each environment, what's the result of a db.version() call?
[01:18:20] <quuxman> pymongo doesn't have that method
[01:18:32] <GothAlice> It does if you execute the command: http://docs.mongodb.org/manual/reference/command/buildInfo/
[01:18:57] <GothAlice> (db.version() is the mongo shell shortcut for the buildInfo().version call.)
[01:19:51] <GothAlice> (db.command() call in pymongo for that)
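A minimal sketch of the lookup described above, assuming `db` is a pymongo `Database` object. The helper name `server_version` is made up for illustration; `Database.command()` and the `buildInfo` server command are the pieces named in the conversation.

```python
def server_version(db):
    """Return the server's version string, like db.version() in the shell."""
    # Database.command() sends a raw command to the server and returns the
    # reply document as a dict; buildInfo replies include a "version" key.
    return db.command("buildInfo")["version"]


# Usage, assuming a reachable server (host and database name are placeholders):
#   from pymongo import MongoClient
#   print(server_version(MongoClient("localhost", 27017)["test"]))
```

Running this in both environments and comparing the two strings would confirm whether they really talk to the same server build.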
[01:25:24] <quuxman> well, using diff I discovered an odd difference (it's the exact same DB connection), but one shows "-fn o-strict-aliasing" and the other "-fno-strict-aliasing" in compilerFlags attr
[01:26:07] <quuxman> and there are a few other flags without spaces
[01:27:06] <quuxman> those are copy and paste errors
[01:27:30] <quuxman> stupid terminal puts newlines where there aren't any on line breaks
[01:29:03] <GothAlice> Well, I'm at a loss. The exception says the server doesn't like it, it's the same server, and same client library. Mind = blown. :/
[01:31:41] <GothAlice> And finally, it may be worthwhile to uninstall the faulty one (pip uninstall pymongo), verify that there are no hanging .pyc files at the path illustrated by the above, then re-installing.
[01:31:46] <GothAlice> Bytecode caches cause insane headaches.
[01:32:26] <GothAlice> (To the point that all of my git repos use a hook to clear all .pyc files on each merge and branch switch.)
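A rough sketch of what such a cleanup could look like in Python. This is not GothAlice's actual hook; the function name `clear_pyc` is hypothetical, and wiring it into git's `post-merge` and `post-checkout` hooks is left as an assumption about how one might run it.

```python
import os
import tempfile


def clear_pyc(root):
    """Delete compiled bytecode files under root and return their paths."""
    removed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Never descend into the repository metadata itself.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            if name.endswith((".pyc", ".pyo")):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed


# Demo on a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as demo_root:
    for name in ("mod.py", "mod.pyc", "old.pyo"):
        open(os.path.join(demo_root, name), "w").close()
    removed = sorted(os.path.basename(p) for p in clear_pyc(demo_root))
    remaining = sorted(os.listdir(demo_root))

print(removed, remaining)
```

As noted a few lines further down, this only handles stale bytecode; it does nothing about compiled C extensions.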
[01:32:48] <quuxman> yep same file, I should delete the pyc to just make sure
[01:33:13] <GothAlice> quuxman: pymongo includes C components; just nuking the pyc would be insufficient.
[01:33:26] <quuxman> oh, good to know. So completely delete that folder
[01:37:28] <GothAlice> As an aside (which shouldn't affect this particular instance, but is still worthy of mention): Python has a package manager (pip); it's wise to use it for a variety of reasons. (Notably .pth file registration of installed packages… how namespace packages are handled… etc. Namespace packages extract multiple packages into one subfolder, so rm -rf will delete more than expected.)
[01:38:48] <quuxman> there is a gremlin in my web server
[01:38:57] <quuxman> interestingly it happens with my dev server too, so it's not just apache2
[01:40:09] <quuxman> (except in that one case where I just blew away that dist-packages/pymongo folder :-P)
[01:40:10] <GothAlice> Also, since this is sudo, pip uninstall also handles files installed outside site-packages (i.e. /usr/bin).
[01:40:52] <quuxman> there is a gremlin that hates $hint in my webserver
[01:41:07] <quuxman> actually I tried removing the $hint, and it showed the same error for $query
[01:41:13] <GothAlice> Indeed. (http://cl.ly/image/340D3U1v0g0z is an example of my zsh setup, which tries very hard to tell me that rm -rf is not usually the right answer. ;)
[01:41:22] <joannac> are you sure you're running the same query?
[01:41:59] <GothAlice> joannac: He's compared pymongo's generated query (cursor._Cursor__query_spec()) and they're the same.
[01:42:01] <quuxman> Yeah, I pasted the query into the file that runs it and replaced the dynamic query with the static variable
[01:43:12] <GothAlice> This is an intriguing problem.
[01:46:24] <quuxman> I'm going to look more closely at the query spec. I just looked at the top level, now I'm going to use diff (they are very big queries)
[01:47:44] <quuxman> how do I prettyprint SON objects, like json.dump?
[01:48:26] <GothAlice> quuxman: __import__('pprint').pformat(obj) to return a string, pprint(obj) to just print it.
[01:49:08] <GothAlice> (the __import__ thing is a shortcut to avoid needing a one-off import statement and unneeded variable assignment ;)
[01:50:04] <GothAlice> You might also be able to use Python's built-in difflib to compare them. ;^)
[01:51:56] <quuxman> sweet, I've been wanting python diff for a while
[01:52:46] <GothAlice> with open("somefile", 'w') as fh: \n fh.write(__import__('pprint').pformat(obj))
[01:53:01] <GothAlice> (That'll dump it to a file easily.)
[01:54:51] <GothAlice> Also, I just accidentally ordered delivery in New Jersey, an entire country away. XD
[01:56:08] <GothAlice> Silly companies with identical names and logos and menus… of which only one has online ordering!
[01:56:30] <joannac> GothAlice: someone in NJ is going to get a surprise delivery?
[01:57:11] <GothAlice> joannac: Nah; I filled out the form properly (was surprised it had a place for region and postal code…) and they called, somewhat bemused.
[01:58:49] <quuxman> how do I compare two SON objects with difflib? I can't get pprint to show newlines
[01:59:12] <quuxman> SON objects are not sequences apparently
[01:59:29] <quuxman> so difflib.SequenceMatcher fails
[02:00:17] <GothAlice> Some SON objects are sequences… or really should be. :/ Especially since SON dicts are ordered…
[02:01:09] <GothAlice> https://pypi.python.org/pypi/dict_compare/1.0.2 — apparently difflib needs a bit of help with dictionaries
[02:01:50] <GothAlice> http://protocultura.cl/hg/dict_compare/file/ea8c76e39609/dict_compare.py — or just copy/paste the tiny bit of code that represents (remember the license header ;)
[02:02:42] <GothAlice> Argh… that lib hardcodes the base classes it scans, so it'll need to be updated to recognize SON objects. T_T Lame.
[02:02:47] <quuxman> I don't want to compare dicts, because that will screw the ordering of the SON
[02:03:15] <quuxman> SON objects should really be sequences :'(
[02:03:25] <GothAlice> quuxman: Nah, I was hoping the code was properly duck-typed (using standard __getitem__ calls) which SON dict-like objects expose.
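One way to get a line-by-line, order-preserving diff of two query specs is sketched below. It swaps `pprint` for `json.dumps(indent=2)`, since that puts one key per line and keeps keys in stored order. `OrderedDict` stands in for pymongo's `SON` here (an assumption: `SON` is a dict subclass, so it should serialize the same way), and the name `diff_specs` is made up.

```python
import difflib
import json
from collections import OrderedDict


def diff_specs(left, right, left_name="left", right_name="right"):
    """Return unified-diff lines between two nested, ordered mappings."""
    def render(obj):
        # indent=2 forces one key per line; sort_keys stays False so the
        # stored key order survives. default=str stringifies values that
        # json cannot encode (good enough for eyeballing a diff).
        return json.dumps(obj, indent=2, default=str).splitlines()

    return list(difflib.unified_diff(
        render(left), render(right), left_name, right_name, lineterm=""))


# OrderedDict is a stand-in for bson.son.SON in this sketch.
shell_spec = OrderedDict([("$query", {"owner": 1}), ("$hint", {"owner": 1})])
web_spec = OrderedDict([("$query", {"owner": 2}), ("$hint", {"owner": 1})])

for line in diff_specs(shell_spec, web_spec, "shell", "web"):
    print(line)
```

An empty result means the two specs serialize identically, including key order.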
[02:04:07] <quuxman> so all I know is the queries _are_ different, they just appear identical to the casual observer
[02:04:44] <GothAlice> I know what you need, here.
[02:05:26] <quuxman> oops, I added 'foo =' to one to import it. They are indeed exactly the same
[02:12:22] <GothAlice> Ah; I may have been mistaken about what you meant by "executing schema changes automatically". XD Midas is the extreme end of that.
[02:12:40] <quuxman> GothAlice: I would want to create a simpler repro case before I opened a ticket
[02:14:21] <GothAlice> crocket: https://github.com/secondmarket/mongeez is another.
[02:14:37] <GothAlice> Midas lets you completely avoid having to "rewrite" any documents, though, which gives it an edge on other migration tools.
[02:15:05] <crocket> GothAlice, Does Midas run a daemon?
[02:15:34] <GothAlice> Yes; it sits completely between your client application and the MongoDB server. It proxies requests, rewriting documents as needed to conform to current schemas.
[02:15:53] <GothAlice> (Based on explicit rules and versioning, of course.)
[02:16:26] <GothAlice> crocket: https://github.com/EqualExperts/Midas/blob/master/distribution-template/documentation/Midas-Overview-Guide.pdf?raw=true for a presentation and overview/guide for it.
[02:16:40] <crocket> GothAlice, That doesn't tell me how stable it is.
[02:17:54] <quuxman> GothAlice: If I make a single file repro case, would you try it out?
[02:17:58] <crocket> cheeser, it's a natural state of many people, I think.
[02:19:48] <GothAlice> quuxman: Sure. Since you're involving a web interface, please include a setup.py that pulls in the right dependencies.
[02:19:55] <crocket> GothAlice, Is midas performant?
[02:20:20] <quuxman> GothAlice: not sure how to do that properly. It'll depend on the pip package werkzeug
[02:21:45] <GothAlice> quuxman: https://github.com/amcgregor/core/blob/develop/setup.py is an example. The output of "pip freeze" could also work, as a text file.
[02:22:12] <quuxman> god knows what I have in there :-P
[02:22:37] <cheeser> doesn't look like midas supports document merging :(
[02:22:43] <GothAlice> (You can ignore most of that file; all you need is a name, version, packages=, and install_requires. In the "structure" of your test case put your files inside a folder (package) beside setup.py)
[02:23:26] <GothAlice> cheeser: No reason why it couldn't be performant; it's effectively applying patches (since it only ever upgrades documents) to the SON going across the wire.
[02:27:34] <GothAlice> crocket: Even still. Last week one thirdparty e-mailed me for support on a different project, simply because someone at his company used the project I worked on and passed my name along. Ugh. XP
[02:27:57] <crocket> GothAlice, You can provide "free" support if you wish.
[02:28:05] <GothAlice> Most of the time I'd agree. I work for an HR firm… now I see the value of managers.
[02:28:09] <crocket> You can provide low-grade support
[02:28:36] <crocket> GothAlice, At least, don't let the manager have more power than you.
[02:28:56] <crocket> If a manager gets more power than you, it's more of a problem than a solution.
[02:29:10] <GothAlice> I've also learned the value of having a Bus Count™ > 1, but yeah, power needs to be spread around, not hoarded.
[02:29:16] <crocket> GothAlice, A manager is there to support you.
[02:29:41] <crocket> You should make it clear that a manager is like an administrator.
[02:30:41] <crocket> GothAlice, I think it's also good to work for an opensource company.
[02:30:51] <crocket> You end up revealing your name, but that seems to be fine?
[02:32:00] <GothAlice> Developers should be free to Flow, managers should provide support and oversight to that end, I agree. Generally means coalescing user requests into less frequent interruptions, prioritizing tasks, planning future scope, and maintaining schedule, freeing the developer to solve problems. At my work we open-source as we get time to.
[02:32:25] <GothAlice> Basically all of the marrow projects (marrow org on GitHub) are me, but they're often also extracts from work code.
[02:32:46] <crocket> GothAlice, If you work remotely and communicate via asynchronous channels, it's mostly fine.
[02:32:53] <quuxman> GothAlice: I created a single file web server with the same query, and can't reproduce it. There's something nasty buried in my web server code
[02:33:06] <GothAlice> quuxman: Werkzeug, you said?
[02:37:47] <GothAlice> Now I'm very, very curious to see if you can jiggle the code around to identify if it's your code, or some library code causing the issue.
[02:38:31] <quuxman> GothAlice: but it does _not_ repro in the web server right after the DB connection is created
[02:40:39] <GothAlice> crocket: The point to Midas is that your application is completely unaware of the migration process.
[02:40:52] <crocket> If morphia loads an object written by a previous version of schemas, what happens?
[02:40:56] <GothAlice> (That's why it's protocol middleware like that, to make the entire process completely transparent.)
[02:41:12] <cheeser> crocket: depends on the schema change
[02:41:24] <crocket> Can it lead to runtime errors?
[02:41:31] <cheeser> you can use annotations to do simple migrations on loads
[02:41:44] <GothAlice> Query in: Morphia -> Midas (doing nothing) -> MongoDB. Result back: MongoDB -> Midas (compares record version against latest version for that collection, runs series of diffs to catch up) -> Morphia.
[02:42:31] <GothAlice> crocket: Instead of your app connecting to MongoDB, you connect it to Midas. Midas passes queries through, but intercepts the returned data, upgrading it if needed.
[02:42:48] <crocket> GothAlice, You said "Wait, no, go with one or the other, not both."
[02:43:00] <cheeser> crocket: hibernate/jpa or morphia
[02:43:14] <crocket> jpa + flyway or morphia + midas
[02:43:44] <GothAlice> Midas is pretty much the definition of "middleware". ;)
[02:44:05] <GothAlice> (Because it goes in the middle, between your app and the real server.)
[02:44:22] <crocket> GothAlice, Sounds like another moving piece that can fail anytime.
[02:44:36] <GothAlice> crocket: It only moves when you add a new migration to it. :P
[02:45:34] <GothAlice> But that's a "launch failure" (use/implementation) issue, not a "rocket design" (library) one.
[02:46:02] <crocket> GothAlice, An external failure.
[02:46:26] <GothAlice> Basically, don't write bad migrations and Midas should consistently apply them.
[02:46:46] <crocket> GothAlice, How do you write migrations based on changes in morphia objects?
[02:47:05] <cheeser> you don't. you write migrations for documents.
[02:47:48] <crocket> For example, I don't know how to translate "@Reference List<Employee> underlings = new ArrayList<Employee>();" into a document field in midas migration scripts.
[02:48:14] <GothAlice> Specifically "expansions" (adding fields, etc.), "contractions" (removal), and "transformations" (alterations of data itself), stored with Midas.
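The upgrade-on-read idea can be sketched in a few lines of Python. This is not Midas's API or rule syntax; the `_version` field, the field names, and every function name below are hypothetical, chosen only to show one expansion, one contraction, and one transformation applied in order until a document catches up.

```python
import copy


def add_status(doc):
    # Expansion: add a field that older documents lack.
    doc.setdefault("status", "active")
    return doc


def drop_legacy_flag(doc):
    # Contraction: remove a field that is no longer used.
    doc.pop("legacy", None)
    return doc


def split_name(doc):
    # Transformation: change the shape of existing data.
    first, _, last = doc.pop("name", "").partition(" ")
    doc["name"] = {"first": first, "last": last}
    return doc


# Step N upgrades a version-N document to version N+1.
MIGRATIONS = {0: add_status, 1: drop_legacy_flag, 2: split_name}
LATEST = 3


def upgrade(doc, version_field="_version"):
    """Return a copy of doc brought up to LATEST, one step at a time."""
    doc = copy.deepcopy(doc)
    version = doc.get(version_field, 0)
    while version < LATEST:
        doc = MIGRATIONS[version](doc)
        version += 1
    doc[version_field] = version
    return doc


old = {"name": "Ada Lovelace", "legacy": True}
print(upgrade(old))
```

A proxy in the position described above would apply something like `upgrade` to each document on its way back to the client, leaving the stored copy untouched.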
[02:49:12] <crocket> I'm going to have to face the same problem with JPA, too.
[02:49:17] <GothAlice> One would suppose that line of code of yours would (by itself) produce a document containing a list of db_refs or ObjectIds called "underlings". I.e. {underlings: [ObjectId(…), …]}
[02:50:00] <GothAlice> crocket: But if you don't see how that code translates into the document I gave, you need to read up on your schema/ODM (object document mapper) layer and poke around at your documents in a Mongo shell.
[02:52:58] <crocket> GothAlice, How would I convince my project manager to use midas + morphia instead of jpa + flyway?
[02:53:19] <GothAlice> Are you more familiar with JPA?
[02:53:33] <crocket> I think nobody in my company is familiar with JPA.
[02:53:46] <crocket> GothAlice, I think he's allergic to new technologies.
[02:54:07] <cheeser> attend a mongodb day somewhere
[02:54:37] <crocket> I'm not sure myself if I need mongodb yet.
[02:54:43] <GothAlice> Then of the two technologies, I'd push for the slightly-purer MongoDB one. I don't know your boss, however, so can only really point you at mongodb.com and mongodb.org whitepapers and use case descriptions, plus the Server Density blog MongoDB section.
[02:56:24] <Moogly2012_> my manager is himself a programmer (a rather good one) so he considers what I have to say more, then again I work at a relaxed environment
[02:56:25] <GothAlice> crocket: But the decision *really* comes down to an evaluation of the technologies for the use case you have, not what others may write about other problems. At work, we created microcosm test cases of our problem on several technologies (including Postgres), then beat the crap (pardon my language) out of the servers under our test load. MongoDB won.
[02:56:35] <crocket> cheeser, In github and valve, a manager has no more power than a programmer.
[02:57:09] <crocket> GothAlice, The project manager didn't give me enough time for benchmark
[02:57:16] <crocket> The deadline is only emotional.
[02:57:56] <crocket> cheeser, The world is shifting toward open allocation slowly.
[02:59:20] <crocket> cheeser, You, too, have a choice.
[02:59:56] <cheeser> you sound as if you think i'm unhappy at my job. that is not even remotely the case.
[03:00:06] <GothAlice> Not allocating time to properly evaluate components is a Very Bad Sign™ and a leading cause of project maintainability issues, organic complexity growth, and failure.
[03:00:09] <crocket> cheeser, I've mostly defied a corporate officer who had been bossing me around for months.
[03:00:21] <cheeser> to be honest, i really don't care
[03:01:00] <crocket> GothAlice, I'll talk him into giving more time.
[03:01:15] <crocket> GothAlice, The current project is an internal project with no actual hard deadline.
[03:01:30] <crocket> The current deadline is here to prevent me from becoming lazy.
[03:01:48] <crocket> The attitude they take is rude since they think I'm inherently lazy.
[03:01:54] <GothAlice> crocket: Consider it this way; without understanding how a solution interacts with a problem domain, the only correct solution to the question "Which should I use?" is "Potato."
[03:02:41] <crocket> GothAlice, The project manager also rejected "play framework" because nobody except me knows play framework already.
[03:03:09] <crocket> He wants a running web service to be fixed immediately at the expense of adopting increasingly better tools over time.
[03:03:46] <quuxman> GothAlice: finally figuring out the problem doesn't occur when I send the DB command, it occurs afterwards somehow
[03:04:47] <Soop> Anyone mind answering a rookie(ish) question real quick?
[03:05:08] <GothAlice> Soop: Ask, don't ask to ask. What's your Q? :)
[03:06:15] <crocket> GothAlice, What microcosm test cases did you write against MongoDB and Postgres?
[03:06:36] <quuxman> GothAlice: YAY! Turns out the problem occurs when running 'list(cursor)'. This breaks when using $hint, does not without it. However '[r for r in cursor]' always works.
[03:07:04] <quuxman> Takeaway: never, ever ever use list(foo), always use [r for r in foo]
[03:07:48] <Soop> I'm trying to reference between a user in a users collection and a post. Using mongoose. Ive looked at other apps and how they reference and im doing it exactly the same. However when pulling the db up with Robomongo, the referenced array is empty.
[03:08:39] <GothAlice> quuxman: A ha! Brilliant! Now that certainly sounds like something to submit to the JIRA ticket tracker, and sounds like a suitably tiny test case.
[03:09:21] <GothAlice> quuxman: MongoDB's ticket system. See the topic of this channel for links.
[03:09:41] <quuxman> why oh why would different commands be sent on calling list() as opposed to... whatever [r for r in foo] calls? Shouldn't that be equivalent??
[03:09:48] <Soop> From my user model -- posts:[{ type: mongoose.Schema.Types.ObjectId, ref: 'Post' }] and posts model -- author:[{type: mongoose.Schema.Types.ObjectId, ref: 'User'}],
[03:10:18] <GothAlice> Soop: Hmm; alas, I personally can't help you with mongoose issues. Generally "references" like that are just ObjectIds, not lists of references.
[03:13:01] <crocket> cheeser, Do you use midas, too?
[03:13:07] <GothAlice> quuxman: cursor.__iter__ probably just returns self (the cursor), then .next() is called repeatedly. list() probably calls __getitem__ with an unbounded slice, which is handled differently. (I may be completely out to lunch, though.)
[03:13:16] <GothAlice> quuxman: The MongoDB folks could better answer that.
[03:13:36] <crocket> How do you handle automatic schema changes?
[03:13:41] <quuxman> GothAlice: that is a nasty API gotcha
[03:14:29] <crocket> It's just too much trouble to convince my project manager.
[03:14:32] <quuxman> especially considering it magically changes the implemented interfaces when using slightly different query syntax
[03:14:33] <GothAlice> crocket: I don't. I don't need to because a) whatever minor changes I ever need to do I just run in a mongo shell (which is extremely rare), and b) my ODM layer provides sane simulated defaults for added fields.
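A toy illustration of "simulated defaults" on the read side, so that old documents never need rewriting. The `Record` class and its `defaults` table are invented for this sketch and do not represent any particular ODM's API.

```python
class Record:
    """Hypothetical read-side wrapper around a raw document dict."""

    # Fields added after some documents were already stored. A callable
    # default is invoked per lookup so mutable values are not shared.
    defaults = {"status": "active", "tags": list}

    def __init__(self, raw):
        self.raw = raw

    def __getitem__(self, key):
        if key in self.raw:
            return self.raw[key]
        if key in self.defaults:
            default = self.defaults[key]
            return default() if callable(default) else default
        raise KeyError(key)


old_doc = Record({"name": "example"})   # stored before "status" existed
print(old_doc["name"], old_doc["status"], old_doc["tags"])
```

The stored document stays as it was; only reads through the wrapper see the newer fields.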
[03:14:41] <crocket> I can do what I like after I'm relieved from the job in 3 months.
[03:15:36] <crocket> GothAlice, Does it mean you don't need midas?
[03:16:23] <GothAlice> I neither need, nor want automated migration tools. :)
[03:16:34] <GothAlice> (One of the little joys of careful data architecture and MongoDB.)
[03:16:38] <crocket> GothAlice, Because MongoDB is flexible?
[03:17:01] <GothAlice> MongoDB is "schema-free". It has no limitation on one document in a collection being in any way similar to another in the same collection.
[03:22:28] <GothAlice> Ah, right, jython. I saw "j" and "thon" and didn't read the middle.
[03:22:33] <crocket> Jython is not getting python 3 upgrade.
[03:22:44] <GothAlice> crocket: A non-issue for my use cases.
[03:22:53] <crocket> GothAlice, It'll be a huge issue in 10 years.
[03:22:56] <GothAlice> crocket: My code is Python 2 and 3 native.
[03:23:25] <GothAlice> crocket: Consider that Python 3 has been out for 9 years now.
[03:23:49] <GothAlice> crocket: "10 years" is not a timescale representing a problem for my code or data.
[03:24:30] <crocket> GothAlice, Because your code will be thrown away in 10 years?
[03:26:07] <GothAlice> crocket: No, that 10 years has passed, and by the time the next 10 years have passed I'll simply have a better solution. (My data retention timescale requirements, OTOH, are on the order of 150 years, FYI.)
[03:26:44] <crocket> GothAlice, built to be scrapped.
[03:28:05] <GothAlice> crocket: Built to be pragmatic about reality. My data has already undergone a migration from relational to MongoDB (that happened around the 5 TiB size) and the API layer has been rewritten a half dozen times from scratch over the last 10-15 years.
[03:28:11] <crocket> GothAlice, I'm aiming for 4-5 decades.
[03:28:12] <GothAlice> The data is what matters, not the code.
[03:28:41] <crocket> GothAlice, The base infrastructure can be stable.
[03:29:18] <GothAlice> crocket: Not really. Ref: Y2K issues and the urgency of correcting a mistake made by people who were tasked with explicitly creating systems to last.
[03:33:40] <GothAlice> crocket: The project is called Exocortex and combines my parallel pipeline AI, personal analytics, data archival, metadata processing and indexing, and my Forever Library projects into one.
[03:36:33] <GothAlice> crocket: How long would rebuilding civilization take if you had access to a copy of Wikipedia, Wikibooks, let alone the book library… and would rebuilding not be better if you could also archive a large portion of other creative works such as music and video?
[03:37:48] <GothAlice> Preserving knowledge is my schtick. ^_^;
[03:37:56] <crocket> GothAlice, Are you planning to depart the earth?
[03:38:34] <GothAlice> crocket: Eventually, but only to increase high availability. Earth and Sol are single points of failure.
[03:39:05] <crocket> GothAlice, I prefer a lot of self-sustaining spaceships over terraforming another planet.
[03:39:50] <GothAlice> crocket: I don't need planets, really. Self-replicating databases FTW. I just need to seed them around. (I also plan on uploading…)
[03:40:28] <crocket> GothAlice, You can't live on databases.
[03:40:36] <crocket> GothAlice, A database has no life-supporting system.
[03:41:06] <crocket> If you like to hang out with your archive, you could build one into it.
[03:41:45] <GothAlice> crocket: In the end, there will only be four units of currency in existence: time, computational power, electrical energy, and storage space. I'll live in a simulation running on the database, so life support isn't needed. (Oxygen dependency and pressure sensitivity are design flaws.)
[03:41:57] <crocket> GothAlice, There'll also be flesh and blood.
[03:42:10] <GothAlice> crocket: Nope, that stuff will die off naturally.
[03:42:10] <crocket> Not everything will be turned into machines.
[03:42:37] <GothAlice> The choice really does come down to upload and live forever, or die as a mortal. (Possibly both depending on the mechanism of upload.)
[03:42:54] <crocket> GothAlice, You can't live forever just because your mind is in a computer.
[03:43:05] <crocket> A computer doesn't get to live forever necessarily.
[03:43:18] <crocket> GothAlice, You should build a powerful spaceship to shield yourself.
[03:43:35] <crocket> You should also be able to regenerate or rebuild.
[03:44:18] <crocket> GothAlice, However, it'd be good to occupy humanoid bodies from time to time.
[03:44:18] <GothAlice> crocket: My database cluster would be entering school right about now if it were a child.
[03:46:25] <Moogly2012_> maybe not general, I would argue it would have to be based on my thoughts and such
[03:46:34] <crocket> Moogly2012_, A human is a general AI.
[03:46:36] <Moogly2012_> otherwise, you're just creating a information bot
[03:46:46] <GothAlice> Moogly2012_: That's the "AI project" part of Exocortex. I've already got it doing word associations based on book training data. I've been polishing up the async processing pipeline code to allow for me to develop "senses" for it.
[03:46:47] <crocket> Moogly2012_, your sentience is a general AI.
[03:46:48] <Moogly2012_> except a human isn't AI, they're not "Artificial"
[03:46:58] <crocket> You're a general intelligence
[03:47:24] <GothAlice> Avoids the prejudicial term "artificial".
[03:47:24] <Moogly2012_> GothAlice: what language maintains all of this? Lisp?
[03:47:48] <Streemo> Moogly2012_: a human mind is built to interface with our physical body, anyone would go crazy.
[03:47:50] <GothAlice> Moogly2012_: Python, for the most part. I'm working on my own language based on my experience in building these first tools, though. (It's called Clueless.)
[03:50:19] <Moogly2012_> I might have seen it around a while back, but never cared enough for Ruby to play with it
[03:50:34] <GothAlice> Streemo: There is zero reason why an uploaded mind would have to know that it has been uploaded. Run a full simulation.
[03:50:51] <Moogly2012_> but how would it see, smell, and hear the same things
[03:51:03] <Moogly2012_> if you upload a mind it would freak out when everything is "black"
[03:51:10] <Moogly2012_> it would be an awful thing
[03:51:33] <Moogly2012_> imagine all your 6 senses as you know them gone right now
[03:51:35] <GothAlice> Moogly2012_: What is sight, smell, and hearing but neural signals? We've already reverse-engineered the auditory and optical protocols of our brains…
[03:51:41] <Moogly2012_> that would be catastrophic
[03:51:50] <Streemo> GothAlice: so you're saying the mind and body would be housed in the program, so to the mind, there is still a body, and all physics would be programmed in?
[03:52:08] <Moogly2012_> and just how much data does a brain take up anyway
[03:52:10] <Streemo> the program would simulate everything
[03:52:13] <GothAlice> Moogly2012_: But why would the simulation be "black" unless it was programmed that way? I say, hire a better programmer to code up a proper simulation environment.
[03:52:26] <Streemo> it wouldn't just have to simulate a brain, it'd have to simulate every feeling and thought too
[03:52:32] <Moogly2012_> so place yourself in a virtual space
[03:52:35] <maxamillion> is it possible to run a query in the mongo shell and have it output the result to a file?
[03:52:50] <GothAlice> Streemo: Feelings and thoughts are chemical and electrical processes that can be simulated. In fact, we do already.
[03:52:53] <Moogly2012_> would be interesting to take a serious gamer and put their mind inside of a competitive game... and see how they are months in
[03:52:55] <crocket> I guess MongoDB would be disastrous for financial transactions.
[03:53:20] <Streemo> GothAlice: now we're trying to use a classical system to simulate quantum mechanics.
[03:53:31] <Streemo> GothAlice: it just becomes more spaghetti
[03:53:44] <GothAlice> Streemo: That's why I'm taking the functional, not bottom-up approach to developing my shell AI.
[03:53:54] <Streemo> GothAlice: ultimately one would ask, is it even possible to simulate a physical world in a computer
[03:54:01] <quuxman> GothAlice: It turns out the problem is in my stupid hand rolled Mongo wrapper that returns my own entity objects on calling cursor.next()
[03:54:02] <crocket> GothAlice, Don't worry. The human brain project is on its way.
[03:54:10] <Streemo> GothAlice: that follows every law of physics known.
[03:56:16] <crocket> GothAlice, Do you think your mind can be transferred to a computer in 3-4 decades?
[03:56:29] <GothAlice> crocket: If it is possible to simulate reality in such a way as to be virtually undetectable (though we do have some evidence by way of the planck length being smaller than the smallest actually possible thing… and time has a granular slice size the same way space does…) then it would also be possible to run a simulation from within that simulation. Given that situation, there is no effective way to ever determine if you are running in
[03:56:29] <GothAlice> the "top level" (reality) or not.
[03:57:09] <Streemo> GothAlice: hmmm im doubtful that we have evidence of discrete time/space
[03:57:13] <crocket> GothAlice, You don't have to simulate reality.
[03:57:13] <Boomtime> that has a long line of assumptions after the anthropic argument: "we exist which should not be a statistically significant event, therefore life is common, therefore life commonly evolves technologically, therefore life produces simulations of life, therefore there are more simulations than realities, therefore we should statistically expect ourselves to be a simulation"
[03:57:16] <GothAlice> (And there may be a near infinite number of recursions… one reality, infinite virtual realities. Statistically we are almost guaranteed to be simulated.)
[03:57:17] <crocket> GothAlice, you don't need to run your mind.
[03:57:20] <Streemo> GothAlice: our tech is not at 10^-34 yet
[03:57:22] <crocket> GothAlice, you only need to run your mind.
[03:57:43] <asher^> is there a syntax for returning an array of _ids that match a query, rather than an array of documents?
[03:57:43] <crocket> GothAlice, Why are we still alive?
[03:57:54] <GothAlice> asher^: Aggregate queries could do that.
[03:58:18] <cheeser> asher^: list fields in your projection.
[03:58:28] <GothAlice> asher^: However you could also just limit the returned data to {_id: 1} (second argument to collection.find) and iterate a normal cursor.
[03:59:14] <asher^> ive got that now, then converting that to a flat array in my application code. wondering if i can get the array i want straight from mongo. ill look at aggregation queries again
[04:01:34] <GothAlice> asher^: Doing it ^ doesn't limit the number of results.
[04:01:49] <GothAlice> Doing it as an aggregate query would; you could have no more than 16MB total data returned (by default).
[04:03:03] <asher^> thanks. just found the distinct method, it seems perfect for what i want
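Both approaches from this exchange, sketched for pymongo under the assumption that `collection` is a `Collection` object. The helper names are made up; `find()` with a projection as its second argument and `distinct()` with a key and filter are the calls being illustrated, as they exist in current PyMongo.

```python
def ids_via_projection(collection, query):
    """Fetch only _id from each match and flatten in application code."""
    # The second argument to find() is the projection; {"_id": 1} asks
    # the server to return nothing but the _id field.
    return [doc["_id"] for doc in collection.find(query, {"_id": 1})]


def ids_via_distinct(collection, query):
    """Let the server return the flat list of _id values directly."""
    return collection.distinct("_id", query)


# Usage, assuming a reachable server (names are placeholders):
#   from pymongo import MongoClient
#   posts = MongoClient()["blog"]["posts"]
#   print(ids_via_distinct(posts, {"author": "asher"}))
```

The projection version streams through a cursor, while `distinct` returns everything in one reply, which may matter for very large result sets.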
[04:05:26] <Moogly2012_> what if you gave your "mind" just access to your desktop, and music, so you make yourself code all day
[04:05:47] <Moogly2012_> then you could possibly clone your "mind" and make yourself finish your work quicker by splitting up jobs
[04:06:31] <GothAlice> Moogly2012_: First thing I'll do after uploading is fork a copy of myself for every book ever written, so I can read them all in the time it takes to read one. (git merge of the mind)
[04:10:38] <Moogly2012_> you can produce data out of a physical object
[04:10:58] <Moogly2012_> if I told you the data of a physical object, you may not be able to produce that object
[04:11:04] <Streemo> ok then a physical object is a representation of data
[04:11:07] <Moogly2012_> and if you could it wouldn't necessarily be the same
[04:11:14] <GothAlice> Streemo: (Aye, it is. Anything physical can be reduced to the data describing the constituent particles and quantum state. The difficulty is getting all of the data accurately and simultaneously.)
[04:11:26] <Moogly2012_> data is a representation of a physical object is how I see it
[04:11:28] <Streemo> right, just check all the quantum numbers, example what i was going for.
[04:11:42] <GothAlice> Quantum teleportation works by transmitting the information about a just-destroyed particle to allow you to reproduce it somewhere else.
[04:13:48] <GothAlice> Alas, the theoretical energy requirements of forming a stable (and non-microscopic) Einstein-Rosen bridge are somewhat problematical for the Portal approach.
[04:13:50] <Streemo> heh, physics is the art of approximation indeed.
[04:14:08] <GothAlice> (Portal is all about wormhole bridges.)
[04:15:00] <quuxman> GothAlice: list(foo) does the same thing as [r for r in foo] EXCEPT it also calls foo.__len__() between foo.__iter__() and i.next()
[04:15:15] <quuxman> my exception is somehow caused in my wrapping of __len__
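A small sketch of the behaviour quuxman describes, written for Python 3 on CPython. There, `list()` asks the object for a length hint (which tries `__len__`) after obtaining the iterator, while a comprehension only iterates. The `Rows` class is a hypothetical stand-in for a wrapper with a faulty `__len__`, not anything from pymongo.

```python
class Rows:
    """Iterable whose __len__ is broken, like a faulty wrapper."""

    def __init__(self, items):
        self._items = items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        raise RuntimeError("broken __len__")


rows = Rows([1, 2, 3])

# A comprehension only uses __iter__, so it succeeds.
via_comprehension = [r for r in rows]

# list() also asks for a length hint, so the broken __len__ surfaces here.
try:
    list(rows)
    list_failed = False
except RuntimeError:
    list_failed = True
```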
[04:17:33] <Moogly2012_> erase all the desires and memories
[04:17:47] <Streemo> you actually can do something close
[04:17:47] <Moogly2012_> would it have really been erased though
[04:17:57] <Streemo> MIT has inserted memories into rats.
[04:18:11] <GothAlice> Moogly2012_: Neural surgery for fun and profit. And yes; if you remove a line from a file and git commit / push it, it's deleted. But also recoverable. ;)
[04:18:27] <Moogly2012_> there was a game about this now that I remember
[04:18:33] <Moogly2012_> but it was annoying to play after a while
[04:18:45] <GothAlice> (You could fork a copy of yourself as of a previous point in time to see how "old you" would react/experience something, as an example.)
[04:19:12] <Moogly2012_> old me was too "gangster" for new me to handle
[04:21:41] <Moogly2012_> when people would skype call me (developers usually) they'd be like "woah, who is this", my writing on forums did not correlate with my real life attitude etc
[04:21:46] <quuxman> but here they are, and they've infiltrated modern society, and I like working with them in a lot of ways, and want to improve the situation
[04:22:49] <quuxman> from one perspective they're destroying my (and many others') life, and from another they've given me all the opportunities I've had for my career
[04:22:52] <Streemo> In a sense, your internet usage led to a forking different version of your mind. but it was still stored in your mind.
[04:23:09] <Moogly2012_> ^ now I'm just my internet self
[04:26:45] <crocket> GothAlice, How do you maintain 25TiB?
[04:26:53] <crocket> It should be quite expensive.
[04:27:20] <GothAlice> crocket: I priced it out; for the cost of cloud storage I could build and maintain an array at home and still have enough money left over to replace every single drive every month.
[04:27:27] <GothAlice> crocket: So at home, it's quite cheap, actually. ;)
[04:27:42] <crocket> GothAlice, How do you safely maintain them?
[04:28:21] <GothAlice> crocket: Yup. Online. All of it is backed up offsite.
[04:28:48] <GothAlice> (I go through several terabytes of bandwidth each month.)
[04:29:28] <crocket> GothAlice, How do you back up?
[04:29:43] <GothAlice> And yes, I costed out this array in cloud storage and three Drobo iSCSI systems were *cheaper*!
[04:30:16] <crocket> GothAlice, I think you wouldn't do periodical total backups.
[04:30:30] <GothAlice> crocket: *cough backblaze cough* Unlimited storage, $5/mo. ;) I'd give you my referral link, but I think that'd be frowned upon. ;^P
[04:30:35] <crocket> It could be mirroring backup...
[04:31:20] <GothAlice> crocket: Backblaze is a hosted (encrypted) offsite backup solution. It's not a live file store, you have to request recovery.
[04:32:06] <crocket> I can't believe it's that cheap.
[04:32:19] <GothAlice> They do file-by-file (striped into smaller chunks for large files like the database stripes) bzip2 compressed and AES encrypted.
[04:32:45] <GothAlice> crocket: https://www.backblaze.com/blog/180tb-of-good-vibrations-storage-pod-3-0/ — my next step up from the Drobos is this.
[04:33:16] <GothAlice> (But the Drobos are basically zero maintenance. A drive fails, it e-mails me and blinks on the rack, I shovel a new drive in.)
[04:35:17] <GothAlice> The drives I'm using are 4TiB Western Digital "Green Power" (low-power) drives. Haven't had a failure on those in the last four years, actually. (Two years ago an unrelated Seagate drive purchased at the same time died…)
[04:36:35] <crocket> GothAlice, There are 6TB HDDs.
[04:36:50] <GothAlice> crocket: I don't push the envelope in densities.
[04:37:11] <GothAlice> Same reason I'm using high-efficiency (and not high-performance) drives.
[04:37:53] <GothAlice> crocket: Are the individual magnetic sites smaller? Then yes, 6TiB HDDs would be inherently less reliable due to physics. ;) (Which can be compensated for somewhat in software, but still.)
[04:41:15] <joannac> maybe I have an old version of pymongo. cur3 = db.blarg.find({ '$query': {}}).sort([('foo', -1)]) doesn't work for me, it doesn't like the $query operator
[04:42:03] <crocket> GothAlice, Is MongoDB reliable?
[04:42:04] <joannac> GothAlice: quuxman: remind me what this test case shows again?
[04:42:36] <GothAlice> Hmm; when using pymongo you should really use the methods for everything (chained building of the query) rather than passing in raw find *command* arguments the raw way.
[04:43:05] <GothAlice> crocket: Quite, yes. That nearly 3-year uptime I quoted earlier was from my primary at work.
[04:43:08] <quuxman> btw, why do indexes and hints have a direction? It still uses the index if it's the wrong direction...
[04:43:38] <GothAlice> quuxman: It's faster to stroll through the index in its natural order (read ahead caching) than reverse.
[04:44:09] <GothAlice> It's a minor implementation detail, but can have a large impact on compound indexes.
[04:44:09] <quuxman> joannac: shouldn't the way I have it work?
[04:44:43] <joannac> quuxman: doesn't work for me "failed: exception: unknown top level operator: $query"
[04:45:25] <GothAlice> quuxman: The issue is more that you're treating pymongo's find() like the mongo shell's. db.blarg.find({}).sort([('foo', -1)]).hint([('foo', 1)]) as joannac mentioned is the correct form.
[04:46:00] <quuxman> joannac: that's my point, is that it creates an exception, when I'm following the pymongo reference docs
[04:46:11] <GothAlice> joannac: I believe that's because in your version the first argument to find() is supplied as the value for $query automatically. Thus you end up with {$query: {$query: {…}}}
[04:47:22] <joannac> quuxman: http://api.mongodb.org/python/current/api/pymongo/cursor.html#pymongo.cursor.Cursor.hint <- that's what I followed
[04:47:26] <crocket> GothAlice, How did you afford to buy all those HDDs?
[04:47:46] <GothAlice> crocket: I didn't buy them all at once, and they didn't all start out at 4GiB. (Another nice thing about Drobos… you don't have to match drives.)
[04:47:55] <GothAlice> 4TiB, rather. Darn dyslexia!
[04:49:20] <GothAlice> crocket: You mess up orders of magnitude all the time, like that, and numbers are a PITA. (KB when you mean MB, etc.) I use the Monofur fixed-width font to compensate (each character has a completely unique shape), but my text entry box in IRC uses a standard sans-serif font, so it doesn't help me write, only read.
[04:49:57] <GothAlice> http://cl.ly/image/441h0F3z2T12 < i.e.
[04:50:32] <crocket> GothAlice, I guess I have a little bit of dyslexia.
[04:50:56] <crocket> GothAlice, 4GiB looked like 4TiB.
[04:51:11] <crocket> My brain manipulates my vision.
[04:51:16] <GothAlice> crocket: Third-person metaphorical "you" in what I wrote. And that's just expectation bias. You knew what I meant, so you read what I meant. ;)
[04:53:53] <GothAlice> crocket: What's your native language, if non-English?
[04:54:43] <quuxman> joannac: you're right it's nowhere in pymongo, I got it from MongoDB $hint documentation. You're not supposed to do that in pymongo I guess
[04:54:59] <GothAlice> quuxman: Each driver has its own "natural way".
[04:55:09] <GothAlice> quuxman: pymongo went with the chained method approach.
[04:55:21] <GothAlice> (The query only actually being issued on first iteration.)
[04:55:49] <joannac> GothAlice: to be fair, so did the shell with .hint(), .sort(), etc
[04:55:57] <quuxman> I can't believe I pissed away my entire day figuring that out
[05:05:43] <quuxman> takeaway: read the damn API docs, and don't get them confused with other API docs
[05:06:31] <quuxman> at some point along my stupid train of mistakes, I tried to call .hint() on my cursor object (which happens to be my own wrapper around pymongo cursor) saw it wasn't there, and gave up on that approach, forgetting that I had wrapped the damn cursor
[05:06:32] <GothAlice> Reading the fine manual is always option #1. :) I'm just glad it got worked out at all.
[05:07:43] <quuxman> At some point in my IRC history you'll see me asking something like "why is the hint method missing from Cursor???"
[05:08:42] <GothAlice> quuxman: When wrapping an object that offers chained methods, it's usually a good idea to write a __getattr__ that passes method calls through to the wrapped object, then re-wraps the returned object before finally passing it back.
[05:09:00] <GothAlice> quuxman: Never have a mysteriously missing method again. XP
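A rough sketch of the pass-through idea GothAlice describes. `FakeCursor` is a hypothetical stand-in for a chainable cursor (it is not pymongo's `Cursor`), and `Wrapper` is an illustrative name; a real wrapper would add its own behaviour on top.

```python
class FakeCursor:
    """Hypothetical stand-in for a chainable cursor (not pymongo's)."""

    def __init__(self, ops=()):
        self.ops = tuple(ops)

    def sort(self, key):
        return FakeCursor(self.ops + (("sort", key),))

    def hint(self, index):
        return FakeCursor(self.ops + (("hint", index),))

    def count(self):
        return len(self.ops)


class Wrapper:
    """Pass unknown attributes through, re-wrapping chained results."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Only called when normal lookup on the Wrapper itself fails.
        attr = getattr(self._wrapped, name)
        if not callable(attr):
            return attr

        def passthrough(*args, **kwargs):
            result = attr(*args, **kwargs)
            # Re-wrap so the chain keeps returning Wrapper instances.
            if isinstance(result, type(self._wrapped)):
                return type(self)(result)
            return result

        return passthrough


cursor = Wrapper(FakeCursor()).sort("foo").hint("foo_1")
```

Methods that return something other than a cursor (here `count()`) come back unwrapped, so only the chainable calls keep the wrapper alive.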
[05:09:58] <joannac> I think I only kind of understand, but I'm glad we got to the bottom of that
[05:11:14] <Moogly2012_> I dont understand how people can drive with headphones on
[05:11:31] <Moogly2012_> I feel like I'm going to die in a horrible wreck every time I see someone driving with headphones on
[05:12:31] <quuxman> GothAlice: that would be the smart thing to do. I vaguely remember attempting that and failing, way back when I wrote this wrapper
[05:12:56] <GothAlice> quuxman: Give me a moment to gist something for you.
[05:14:39] <GothAlice> That's p-code-ish, of course (I don't know what your wrapper does or how) but illustrates the point.
[05:15:41] <GothAlice> (You could also just subclass the object instead of wrapping it. You effectively risk attribute name collisions either way.)
[05:17:05] <quuxman> The reason I didn't subclass it is because i wanted to redefine all the methods
[05:17:49] <GothAlice> quuxman: Might I gently suggest you don't do that? https://github.com/MongoEngine/mongoengine/tree/master/mongoengine/queryset is an example of a project that does that. It's not a small amount of code.
[05:18:27] <GothAlice> (That's 1,545 SLoC—comments and blank lines removed—with tests.)
[05:18:41] <quuxman> Well, it's a very simple redefinition. Actually, maybe .next is the only thing I need to redefine
[05:18:51] <GothAlice> (Excluding all of the support code included for that.)
[05:18:51] <quuxman> Because anything that returns the cursor will return the subclass
[05:19:24] <quuxman> all I want is to add a "collection" property which is the type / constructor for the object I want out of next()
[05:22:46] <GothAlice> quuxman: http://api.mongodb.org/python/current/api/pymongo/collection.html#pymongo.collection.Collection.find has a bit of documentation on it (since this is a Cursor factory).
[05:23:28] <quuxman> well, lol. this is ridiculous
[05:24:00] <GothAlice> The adage is true: if you have a good idea, chances are someone else has already thought of it. Little compensation for the effort you've spent, though. T_T
[05:25:19] <GothAlice> (Also why solving a problem encountered is different from solving the problem that created the problem encountered. ;)
[05:25:49] <quuxman> yeah I love deleting out dumb code
[05:26:04] <GothAlice> A good day is any day where my total lines changed is negative. ^_^
[05:26:08] <quuxman> it's always awesome when you can solve a bug by making a small change then deleting a big block of code
[05:26:46] <quuxman> ok, moment of truth, does replacing my Cursor wrapper with a call to find with as_class Just Work?
[05:28:02] <GothAlice> What's the argument spec for the constructor of your document class? I believe it just gets passed the record returned by .next().
[05:29:21] <GothAlice> (And that callable can be a callback function, not just a class. That's how ODMs like MongoEngine transform documents in one collection into instances of several different model classes.)
[05:31:13] <quuxman> huh, small problem, my entity classes want to know what collection they belong to, so they take an extra argument
[05:31:55] <GothAlice> quuxman: You can use a tiny sprinkle of magic to solve that.
[05:32:49] <GothAlice> Semantic meaning, easy access to the wrapped object, transference of docstring and argspec, etc., etc.
[05:32:50] <quuxman> I could use it in decorator form, right?
[05:33:23] <GothAlice> I wouldn't in this case, unless your document class is extremely, extremely specific.
[05:33:45] <GothAlice> But if you're going to decorate the class, just add a class attribute with the collection name instead.
[05:33:51] <GothAlice> (No need to wrap at all in that case.)
[05:34:19] <quuxman> oh doh, I could just move it from the constructor to a class property
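A minimal sketch of the change quuxman lands on: the collection moves from a constructor argument to a class attribute, so the constructor only needs the document. The names `Entity`, `Post`, `collection_name` and `"posts"` are illustrative assumptions, not quuxman's actual code.

```python
class Entity:
    # Class-level setting instead of a constructor argument.
    collection_name = None

    def __init__(self, doc):
        self.doc = doc


class Post(Entity):
    collection_name = "posts"  # hypothetical collection name


post = Post({"_id": 1, "title": "hello"})
```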
[05:34:31] <GothAlice> My documents all include a _cls key which specifies the class to use, FYI. Saves a lot of time. ;)
[05:34:53] <GothAlice> (Well, not all, but 95% do.)
[05:35:08] <quuxman> I've got that on one collection. All the other collections there's a one-to-one with classes
[05:35:40] <quuxman> that's a gnarly join collection that has a class_name property
[05:35:54] <quuxman> and is in fact the one that needs the hint
[05:35:57] <GothAlice> MongoEngine handles that by having a dictionary that gets merged (rather than replaced) on subclass, called meta. (meta = dict(collection='foo')) If you don't specify one, the class's __new__ takes the __name__ and uses that automatically.
[05:36:30] <GothAlice> (Thus most of the time you never explicitly set a collection name. I do, but I'm silly.)
[05:36:57] <GothAlice> quuxman: It's very smooth. And I do craaaazy things with it. (I'm also a contributor; the triggers/events documentation and .scalar() method are mine. ;)
[05:37:03] <Moogly2012_> is mongoengine considered part of the tools recoded in Go?
[05:37:11] <crocket> Does MongoDB have better performance in CRUD operations and search than MySQL?
[05:38:12] <GothAlice> crocket: Depends highly on your code's intent and usage patterns. I stream all HTTP requests and responses from my production webapp into MongoDB and it adds a few milliseconds to each request. MySQL would take tens of tens of milliseconds. OTOH, I never check if the server got the data I tried to insert, let alone if that data was written to disk!
[05:38:43] <crocket> GothAlice, Power outage will hurt you.
[05:39:20] <GothAlice> crocket: I'll only lose between a few seconds and a few minutes of historical data; and those requests (if the DB died) likely don't matter.
[05:39:47] <crocket> GothAlice, If you're running financial transactions, it may matter.
[05:39:49] <GothAlice> crocket: The stream is for replay of request chains in development and customer service live support (we can watch what you do on our site in realtime).
[05:40:47] <crocket> GothAlice, Are you sure you don't need schema versioning?
[05:40:50] <quuxman> I could really use a more semantic DB layer. For example, in almost all cases, I want to add a certain filter when querying a certain collection (a property that indicates whether something is private or publicly listed), and my system just tosses that in for far too many queries where it should only be in one place
[05:41:23] <quuxman> It makes it far too easy to write a bug that makes private posts appear in public lists
[05:41:35] <GothAlice> crocket: If you are careful about how you write your schemas (i.e. avoid renaming and reorganizing things like the plague and/or only add new fields) it's worry-free. If you do rename or delete things, you may need to run a command or two in the mongo shell, but that's a cakewalk 99% of the time.
[05:42:01] <crocket> GothAlice, If you modify your schemas frequently, you are going to need schema versioning.
[05:42:11] <quuxman> crocket: dealing with schema and data migrations is many times more pleasant in Mongo than with MySQL in my personal experience
[05:42:12] <GothAlice> (My "migrations" are .js files containing usually one to three lines I funnel into mongo.)
[05:42:26] <crocket> GothAlice, Are those files versioned?
[05:42:29] <quuxman> crocket: because the DB is designed to not care about records in the same collection having completely different schemas
[05:42:37] <GothAlice> crocket: Nope. I run them manually when needed. Which is only ever once.
[05:43:07] <GothAlice> (Well, thrice. Once in development on test data, once in staging on a clone of production data, then finally in production.)
[05:43:07] <crocket> GothAlice, What did you say an alternative to Midas was?
[05:43:57] <quuxman> crocket: I have a similar-ish system, where each migration is in a python file that uses my standard set of DB tools to run a simple command on everything in the collection. I test this migration on a couple copies of the DB, then I run it on the live DB, and rename the migration from pending_ to 2014-10-15_whatever.py
[05:44:21] <GothAlice> crocket: https://github.com/ModusCreateOrg/mongo-migrate-schema is one, http://vladmihalcea.com/2014/10/17/mongodb-incremental-migration-scripts/ is a post describing several others; Google is your friend here.
[05:46:25] <GothAlice> https://gist.github.com/amcgregor/55c337a14c2ef66dce9a is an example of the single largest "migration" (in terms of SLoC) in my production codebase at work. :)
[05:47:11] <GothAlice> python -m rita.migrate create_twitter # how it gets run
[05:52:19] <GothAlice> crocket: And yeah, I automate it. In a git merge hook: git diff --name-only @{1}.. | grep "migrate/[a-z_]*\.py" | xargs --no-run-if-empty -n 1 python -m rita.migrate # the migrate module basename() and strips extension.)
[05:52:40] <GothAlice> Er, filename(), not basename(). XD
[05:52:44] <crocket> GothAlice, Does mongeez deal with git merge?
[05:52:54] <GothAlice> crocket: No idea how it could.
[05:53:06] <GothAlice> crocket: Git and your brain deal with git merges.
[05:53:21] <crocket> GothAlice, My brain can do it awkwardly.
[05:53:23] <GothAlice> crocket: (I.e. you'd have to write a hook script yourself.)
[05:53:34] <crocket> However, I'd never make a second branch.
[05:53:46] <GothAlice> crocket: pull does a merge in many situations, FYI.
[05:58:46] <GothAlice> My system root-level merge hook to automatically reload system services affected by a pull: for cmd in $(git diff --name-only @{1}.. | xargs qfile -Cq | xargs --no-run-if-empty equery files | grep init.d/ | sort -u); do ${cmd} reload; done
[06:23:55] <GothAlice> Ooh, https://github.com/e-conomic/mongopatch might be nice enough for me to rewrite in Python.
[06:40:02] <quuxman> GothAlice: with as_class=foo, it's calling foo with no arguments
[06:40:55] <GothAlice> quuxman: Not wrapped, right?
[06:41:56] <quuxman> functools.wraps does something I don't quite understand / not what I want I think
[06:42:27] <quuxman> I have: entity_constructor = partial(self.entity, self._col); return self._col.find(spec=spec, as_class=entity_constructor, **opts); # in my find wrapper
[06:43:09] <quuxman> and I get a takes exactly 3 arguments, 2 given exception, which is what happens when as_class is called with no arguments
[06:43:28] <quuxman> when calling as_class with a dict in the debugger, it works as expected
[06:44:20] <GothAlice> Waitaminute. What is the value of self.entity, and if it is a class, what is the exact "def" line for __init__?
[06:45:45] <quuxman> self.entity should probably be called self.Entity. It's the wrapper class for that wrapped collection. Entity has: def __init__(self, collection, doc):
[06:49:47] <GothAlice> https://github.com/mongodb/mongo-python-driver/blob/master/pymongo/helpers.py#L76-L78 (_unpack_response) is where as_class is eventually used, after a call tree originating in collection.find, then cursor.__send_message.
[06:51:22] <GothAlice> (But upstream in that call tree is re-defaulted to the connection.document_class here: https://github.com/mongodb/mongo-python-driver/blob/master/pymongo/connection.py#L93-L94, which is dict. ;)
[06:51:43] <crocket> If I use MongoDB on nodejs, do I need a special ORM tool?
[06:51:52] <crocket> Or do I just need a mongodb driver?
[06:52:05] <GothAlice> crocket: You can use the driver straight. Many do.
[06:52:29] <GothAlice> Others like type conformance (i.e. user.name is always a string) and opt for light-weight schema *validation* libraries on top of that.
[06:53:34] <GothAlice> Some go full-monty and do the ODM (object-document-mapper) thing, with typecasting, additional events, etc.
[06:58:55] <crocket> I'm not sure if mongopatch is nearly perfect or just inactive.
[06:59:47] <GothAlice> crocket: It has a good API. Nice command-line tool. Expressive (and powerful) migration callbacks. Last updated 14 days ago. (That's really not bad.) Backed by a commercial entity. (Yay for commercial open source!)
[06:59:56] <GothAlice> crocket: How could you say no to those puppy eyes?
[07:00:08] <crocket> GothAlice, I looked at commit frequencies.
[07:03:37] <GothAlice> Likely those few commits are coalesced from larger feature branch merges and pruned of sensitive information. They'll be batched from groups of changes upstream in their internal codebase. (That's generally how I do things at work, too.) Side-effect of being backed by a commercial entity. https://github.com/e-conomic — they have *three pages* of projects. http://e-conomic.github.io — It's something they're passionate about.
[07:10:36] <GothAlice> crocket: I take the opposite approach to you; I write excellent code at work, show the code off repeatedly to co-workers in comparison to other things that are out there… and convinced my boss of the merits of open source contribution. (Free labour and recognition being nice, of course, but community, and a sense of obligation to give back to the open source projects we use came in, too.)
[07:13:01] <GothAlice> I'll be adding a dump of the development and production "pip freeze" package lists and the VM worldfile of system-level packages in the near future. :3
[07:26:13] <quuxman> GothAlice: for the life of me I can't figure out as_class. Here's what I came up with, which is far simpler than my stupid former Cursor wrapper: http://pastebin.com/szzsmDxM
[07:35:40] <quuxman> I think my current solution is satisfactory
[07:36:04] <GothAlice> Subclassing (or wrapping) the cursor is the only viable way to do what you are wanting. Thus: "use MongoEngine" ;^)
[07:38:18] <GothAlice> (It does lots of other pretty things, and you can mix-and-match if you want. I do a lot. Like: https://github.com/amcgregor/forums/blob/develop/brave/forums/component/thread/model.py#L108-L121)
[07:39:16] <quuxman> There have been a couple of times recently where I've been bit very hard by Python not being particularly clear where an exception occurs
[07:39:59] <GothAlice> Different libraries have different terminal outcomes. ;) My own libraries try to be very descriptive, sometimes even giving you the solution in the message.
[07:40:14] <quuxman> and both times it has boiled down to iterators calling __foo__ things in opaque ways
[07:40:27] <GothAlice> All of Python operates through __foo__ magic.
[07:41:11] <GothAlice> In Python 3 you get real recursive exception handling. I.e. if in handling an exception another exception occurs, you get to see the traceback for both. It's _incredibly_ useful.
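A short illustration of the Python 3 behaviour mentioned above: when a second exception is raised while handling a first, the first is kept on `__context__` and both tracebacks are reported. The `work` and `cleanup` functions are invented for the example.

```python
import traceback


def cleanup():
    raise ValueError("cleanup failed")


def work():
    try:
        {}["missing"]          # raises KeyError
    except KeyError:
        cleanup()              # raises ValueError while handling KeyError


try:
    work()
except ValueError as exc:
    chained = exc
    # The formatted report contains both tracebacks, original first.
    report = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
```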
[07:41:13] <quuxman> the last time I had an ifilter, then an islice, then an iteration. It took me literally weeks (probably months) to figure out the problem was several lines above with the ifilter, not the iterator line that was blowing up
[07:41:33] <GothAlice> Eeeeeew. Chaining things like that in Python means you aren't thinking pythonically, usually.
[07:41:35] <quuxman> would that solve that problem?
[07:42:12] <GothAlice> Those things are opaque, yes, but there are better ways to implement them. ifilter is a perfect candidate for a trivial generator function (which will give a beautiful traceback):
[07:42:58] <GothAlice> def odds(src): for i in src: if i % 2: yield i
[07:43:06] <GothAlice> Bam. (Insert newlines and indentation where appropriate.)
[07:43:35] <GothAlice> Pass in something that can't be %'d and you get a clear error and traceback. (for i in odds([None]): pass)
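The one-liner above with newlines and indentation restored, plus the failing call from the last message. The surrounding `try`/`except` and the variable names `result` and `error` are only there to make the sketch self-contained.

```python
def odds(src):
    """Yield only the odd items of src."""
    for i in src:
        if i % 2:
            yield i


result = list(odds([1, 2, 3, 4, 5]))

# A value that can't be %'d fails inside odds() itself, so the innermost
# traceback frame is the generator function rather than an opaque helper.
try:
    for i in odds([None]):
        pass
    error = None
except TypeError as exc:
    error = exc
```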
[07:45:02] <quuxman> why does ifilter prevent the clear traceback?
[07:45:48] <quuxman> As someone who enjoys haskell, ifilter seems the much more elegant / natural approach
[07:46:02] <GothAlice> Lambdas are… generally considered ugly in Python.
[07:46:14] <GothAlice> They're the "C" way of processing data.
[07:46:45] <GothAlice> Python has built-in coroutines via generator functions (using yield instead of return), so creating descriptive processing pipelines is extremely easy.
[07:49:24] <GothAlice> Also generator expressions.
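A small sketch of a generator pipeline in the sense described above, with the last stage also written as a generator expression. The stage names `parse` and `squares` and the sample data are made up for illustration.

```python
def parse(lines):
    """Turn raw text lines into integers, skipping blanks."""
    for line in lines:
        line = line.strip()
        if line:
            yield int(line)


def squares(numbers):
    """Yield the square of each number."""
    for n in numbers:
        yield n * n


raw = ["1", " 2 ", "", "3"]

# Each stage pulls lazily from the one before it.
pipeline = squares(parse(raw))

# The same final stage written as a generator expression.
as_genexp = (n * n for n in parse(raw))

total = sum(pipeline)
```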
[08:01:33] <GothAlice> quuxman: https://gist.github.com/amcgregor/2ddc11f4ed385afe8429 — Here's a sample processing pipeline from work. It's got some nifty code, but it basically works from the bottom up. (__call__ iterates self.jobs iterates self.urls iterates self.sources, with self.jobs doing some parsing of JavaScript and self.urls parsing a sitemap.xml file.)
[08:04:18] <GothAlice> Note how I explicitly avoid callbacks: this avoids mangling tracebacks unduly.
[08:13:10] <quuxman> GothAlice: is Job.objects() a wrapper around pymongo find basically?
[08:13:48] <quuxman> and foo__in=bar is sugar for "foo: {'$in': bar}" ?
[08:14:17] <GothAlice> It's a MongoEngine wrapper around the entire process of building queries. It prepares them beforehand in stages. Part of the syntax is a mapping of keyword arguments to nested dictionary structures.
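To make the keyword-to-dictionary mapping concrete, here is a simplified sketch. `build_query` is a hypothetical helper written for this illustration; it is not MongoEngine's implementation and ignores nested fields, field renaming, and conflicts between a plain value and an operator on the same field.

```python
def build_query(**kwargs):
    """Translate foo__in=[...] style keywords into a MongoDB query dict."""
    query = {}
    for key, value in kwargs.items():
        field, sep, op = key.partition("__")
        if sep:
            # foo__in=bar  ->  {"foo": {"$in": bar}}
            query.setdefault(field, {})["$" + op] = value
        else:
            # foo=bar  ->  {"foo": bar}
            query[field] = value
    return query


q = build_query(age__gt=18, age__lt=65, tag__in=["a", "b"], name="x")
```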
[08:14:41] <quuxman> about what I imagined. Why I said "basically"
[08:15:48] <quuxman> I should use MongoEngine probably. Writing things like "{'$in': ...}" and "{ '$gt': ..., '$lt': ... }" gets pretty tedious
[08:16:23] <quuxman> but I also like having less layers
[08:16:30] <GothAlice> It also lets you avoid rename migrations by letting you disconnect the object attribute name from the BSON name.
[08:16:39] <quuxman> how would you add a hint to something?
[08:17:52] <GothAlice> For me this means (as you may have noticed from the forum example link above) each key in each document I ever use takes no more than 6 bytes. ;) Same as in pymongo, basically. http://docs.mongoengine.org/apireference.html#mongoengine.queryset.QuerySet.hint
[08:21:41] <quuxman> that's kind of nuts... I guess you just have to use your MongoEngine code to construct queries in a reasonable way
[08:21:52] <quuxman> It always bothered me a bit using descriptive words in collection records
[08:22:20] <quuxman> But never bothered me much, because I assume MongoDB is designed to disappear repeated property names in large collections
[08:22:26] <GothAlice> quuxman: You can always add __raw__ as a dictionary argument to .objects() and .filter() and co to add a personal touch.
[08:22:47] <GothAlice> Or, like the forum example linked way up above now, you can grab a handle to the raw pymongo collection and do things fancy-free.
[08:22:54] <GothAlice> (I.e. in performance critical stuffs.)
[08:23:08] <GothAlice> Though for the most part, MongoEngine is quite good performance-wise.
[08:23:44] <Streemo> GothAlice: how come you always got something useful to say?
[08:25:50] <Streemo> my current theory is that she has hired several underpaid workers, at least two orders of magnitude, to sit around and google things for her
[08:26:03] <Streemo> then write the results into her master mongodb
[08:26:30] <GothAlice> If I had these mythical employees, and they google'd it, it'd already be in my dataset. ;)
[08:27:39] <GothAlice> Think of it this way. I get an idea or question or thing I want to know about. I Google it. (Actually Duckduckgo, but whatever.) My cluster eats that HTTP response and chews on it for a while, parallel processing every result on the first page.
[08:27:42] <GothAlice> Now I know about that thing.
[08:28:18] <Streemo> i knew you had a program giving you answers!
[08:29:39] <quuxman> Streemo: pings like this should in your client
[08:29:53] <GothAlice> Streemo: Aye. Push notifications of topical information, with appropriate cooldown between each. I don't trust Google to "personalize" my results, so I do it myself. Like that.
[08:29:55] <Streemo> thought so, but that wouldnt need 1200wpm to read.
[08:30:32] <Streemo> At this point i wouldnt doubt if you actually wrote a program to further fish google's results, or if you're just good at google searching.
[08:30:34] <GothAlice> Streemo: I read the results that ping, and if I chose to read it the cluster goes off and repeats, potentially highlighting deeper links to follow.
[08:32:56] <Streemo> but thennnnn still no clicking, you DEFINITELY would need some sort of cache
[08:33:12] <GothAlice> Streemo: Nah; it evaluates candidate links in parallel. I read one, maybe have another text-to-speeching at me. It'll send me another ping when it estimates I've finished reading. (nWords/1200 minutes)
[08:33:40] <GothAlice> I do use techniques for memorization of information, too.
[08:35:03] <Streemo> nah the whole eval in parallel thing
[08:35:10] <Streemo> if you read 1200 wpm you could certainly do that
[08:35:29] <Streemo> which is why - the confusion on my part.
[08:35:56] <GothAlice> (I'm living proof that a separation between software and mind need not be clear-cut. ;^) All powered by Python and MongoDB! Woo!
[08:38:00] <GothAlice> And virtual machines prove that software environments (including whole operating systems) can operate completely independently of the underlying hardware.
[08:38:22] <Streemo> i was getting more at the physical limitations
[08:38:38] <Streemo> wires, how they are structured together
[08:39:18] <Streemo> neurons are just so much more complex than circuits (even though you can simplify them as circuits in certain approx.)
[08:39:36] <Streemo> the thoughts that run on neurons
[08:39:43] <Streemo> are different than the programs that run on wires
[08:40:06] <Streemo> a different character, i guess, due to the difference in the underlying structure of the place in which it runs
[08:40:31] <GothAlice> But memory isn't a transient mental state, it's physical wiring changes. The computer modifying itself, literally in this case. Luckily, we recently partially solved that learning-driven rewiring problem. :)
[08:44:43] <GothAlice> We've known *about* memristors for a long time. Around 2000 we actually figured out how the damn things work at semiconductor scales.
[08:44:54] <GothAlice> (Took advances in electron microscopes to do it.)
[08:45:17] <quuxman> is there a way to remove indents from the start of each line in an """ string """ ?
[08:45:36] <GothAlice> That paper is from 2013. ¬_¬
[08:46:19] <Streemo> gotta give props to the guys working in mesoscopic physics
[08:46:31] <GothAlice> https://docs.python.org/2/library/textwrap.html#textwrap.dedent — you're going to want to probably slice off the first line and add it back after dedention, though.
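A sketch of the slice-then-dedent trick suggested above. `textwrap.dedent` removes only the whitespace common to every line, so a first line that sits right after the opening quotes (no indent) prevents any dedenting unless it is split off first. The helper name `dedent_doc` and the sample text are assumptions for illustration.

```python
import textwrap


def dedent_doc(text):
    """Dedent everything after the first line, then re-attach the first line."""
    first, sep, rest = text.partition("\n")
    return first + sep + textwrap.dedent(rest)


def usage():
    # Typical case: the string is indented along with the surrounding code.
    text = """Usage: tool [options]
        -v  verbose
            (repeat for more detail)
        -q  quiet
    """
    return text


cleaned = dedent_doc(usage())
```

Starting the string with a backslash after the opening quotes and putting the first line on its own indented row avoids the need for the split, at the cost of a slightly odd-looking literal.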
[08:47:31] <GothAlice> Streemo: The current tech relies on oxygen atoms migrating across a bi-metallic bridge, changing the resistance over time based on current applied due to this migration. (We needed to see the migration to confirm the mechanism of action… we really had no idea how they worked, but managed to build them. ;)
[08:47:46] <GothAlice> Streemo: That last one wasn't for you. ;)
[08:48:05] <Streemo> i know, which is why its link frenzy
[08:48:20] <Streemo> parallel processing, though, right ;P?
[08:49:04] <Streemo> do you see little things moving around, almost like little molecules, (but not molecules)
[08:49:12] <GothAlice> (I have MongoDB capped collections acting as message busses, with coroutine pipelines pulling data from some, and pushing data to others.)
[08:49:40] <GothAlice> Streemo: Those are called "floaters". Dust on the surface.
[08:51:07] <Streemo> i ask you this because a friend of mine can see that and she is also crazy like you
[08:51:15] <Streemo> crazy in the i can read 12093481290348 wpm
[08:51:16] <GothAlice> You, sir, appear to have an extraordinary claim about reality. I eagerly await links to supplementary information, possibly evidence and whatnot. ;^)
[08:57:22] <GothAlice> I added to the technique I read about by tying the visual memory to muscle memory by tapping my thumb to either my index or middle finger to represent moving "left" or "right" in the tree. (For example, my latest one has an entrance hallway, left to go to the kitchen, right to go down the hall.)
[08:58:20] <Streemo> that actually sounds like fun
[08:58:23] <GothAlice> (I twiddle when I'm thinking… it works quite well!)
[08:58:36] <Streemo> i bet i can learn this, but itd take me lots and lots of patience and im adhd
[08:59:17] <GothAlice> I have ADD and dyslexia. Took me several years to get it right, and the technique is different for everyone. Videogames helped. My first and second palaces are of the entire Ultima Underworld 1 and 2 maps.
[08:59:40] <GothAlice> Streemo: The muscle memory helps with speeding up recall, rather than going through the whole process of "visually walking down the hall", etc.
[09:01:23] <GothAlice> I even have concepts stashed in the same places you'd pick up backpacks and stuff. It's kinda surreal if I think about it too hard. ^_____^
[09:02:24] <GothAlice> Exactly. Every good boy deserves fudge. A fat german kid in green lederhosen stuffing fudge in his face in my kitchen. Right right right left.
[09:09:16] <Streemo> You should definitely look into the eye phenomenon. I'm pretty sure it's the error your brain detects/compensates for when you intercept light.
[09:14:16] <krion> if i add a secondary to my actual (primary + secondary + arbiter) rset i'll have four members, hence tied election
[09:14:28] <GothAlice> krion: Due to a variety of algorithmic reasons it doesn't really matter as long as a) the nodes can communicate with a majority of each-other and/or b) the primary is already established.
[09:33:51] <krion> is there a point in having multiple arbiters?
[09:34:56] <GothAlice> krion: Yes; geographically diverse data centres, for one.
[09:35:20] <GothAlice> If there is a break between data centres, you don't want an odd number of voters on the other side.
[09:36:16] <GothAlice> (Though sometimes you do, so there are ways of doing that, too.)
[09:37:42] <krion> yesterday we had a situation where we had a primary, a secondary and two arbiters
[09:38:08] <krion> we removed an arbiter, then a secondary, and then the primary became a secondary
[09:38:17] <krion> that's a situation i don't want right now
[09:40:20] <GothAlice> krion: When the queryable servers can't reach a majority of other queryable servers, they go into read-only mode to protect integrity. They don't know if *they* lost their internet connection, or if everybody *else* did, and have no way of knowing if their data is the latest with any semblance of accuracy.
[09:40:57] <GothAlice> What you need is three replicas, one primary, two secondaries.
[09:41:35] <GothAlice> A master:slave setup is a backup/archival setup, not a high-availability one.
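The arithmetic behind the majority rule discussed above, as a small sketch. It assumes every member carries exactly one vote; the helper names are hypothetical. A primary can only be elected (or stay primary) while a strict majority of voting members can see each other.

```python
def majority(voting_members):
    """Smallest number of votes that is more than half of the set."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """How many voting members can be lost while a majority is still reachable."""
    return voting_members - majority(voting_members)


for n in (2, 3, 4, 5):
    print("members=%d majority=%d tolerated_failures=%d"
          % (n, majority(n), fault_tolerance(n)))
```

Under these assumptions a four-member set tolerates one failure, the same as a three-member set, which is why losing two of four members leaves the remaining primary unable to keep a majority.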
[09:46:04] <Avihay_work> any where I should refer to to learn how I should structure my data? I currently have a simple structure, and the online documentation sorta answered my questions, but I'm afraid it won't be enough for long
[09:48:51] <Mmike> Hi. When I do rs.initiate() the node I run this on automatically becomes PRIMARY. Also, that node is reported/added by its hostname, not the IP address. How can I force adding/initiating by IP address?
[09:59:52] <PirosB3> Hi all, my MongoDB stops responding after only 132537 entries in the DB
[10:22:56] <PirosB3> quuxman: can I also do that with aggregate queries?
[10:23:07] <PirosB3> see, 80% of my queries are aggregate queries
[10:23:34] <PirosB3> but just to understand, how much does having 1GB ram influence?
[10:25:52] <quuxman> 1GB is pretty limited for a DB server. Given how cheap RAM is, I'd recommend at least 4
[10:26:10] <quuxman> you could spend a lot of time hacking your way around your app to make things work with 1G with the DB size you have, but it's not worth the effort
[10:27:29] <PirosB3> sure, I will do some small optimization
[10:27:41] <PirosB3> but yeah sure hardware limitation is quite obvious here
[15:29:51] <FIFOd[a]> GothAlice: I solved my issue. It was indeed a timeout. The node.js MongoClient docs say the default is false, but this is not true.
[15:30:22] <FIFOd[a]> var messagesCursor = messages.find({}, {timeout: false}); {timeout: false} is required here.
[15:41:02] <Bitpirate> Hey, i need to insert/update a large array of documents in my MongoDB, do i need to foreach these items and make an insert, or is there something like a bulk insert?
[15:41:40] <Bitpirate> ooh, i forgot to mention that i use mongoose
[15:49:19] <ddod> Mongo noob here: I'm trying to rename all instances of an array element in all documents that have that instance. e.g. doc={tags: [x, y, z]} and I want all instances of z to switch to k. Any help would be appreciated.
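One way to approach ddod's question, sketched under assumptions: the positional `$` operator in a `$set` rewrites the first array element matched by the query. The function names below are made up for illustration; the first only builds the filter and update documents, and the second is a pure-Python model of the effect on one document, so nothing here needs a running server.

```python
def rename_tag_spec(old, new, field="tags"):
    """Build (filter, update) documents that use the positional `$` operator."""
    query = {field: old}                     # docs whose array contains `old`
    update = {"$set": {field + ".$": new}}   # `$` = first element matched by the filter
    return query, update


def rename_tag_locally(doc, old, new, field="tags"):
    """Pure-Python model of what one pass of that update does to a document."""
    tags = list(doc.get(field, []))
    if old in tags:
        tags[tags.index(old)] = new          # only the first occurrence changes
    return dict(doc, **{field: tags})


query, update = rename_tag_spec("z", "k")
# With a live server and pymongo 3+, these would be passed as:
#   collection.update_many(query, update)
print(query, update)
print(rename_tag_locally({"tags": ["x", "y", "z"]}, "z", "k"))
```

Because `$` touches only the first matching element per document, an array holding the same value twice would need the update run again until nothing matches.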
[16:38:52] <edrocks> if my admin user in my admin db has userAdminAnyDatabase why cant i use show collections in a new db?
[16:41:31] <edrocks> nevermind i had to use dbAdminAnyDatabase instead
[16:54:30] <appleguru> How do i run a query like this: db.myCollection.find({ "my_key" : "abcdefg", "raw_json.areas.0.array_1.*.array_2.*.value_I_want" : 0})
[17:11:24] <gansbrest> hi guys. I'm getting 'no valid seed servers in list' when trying to connect from nodejs to mongo. I have 3 nodes, one primary and 2 secondaries, do I need to specify all 3 nodes in the library, or just one as in the example?
[18:10:40] <quuxman> appleguru: you're looking for $elemMatch
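A hedged sketch of what a nested `$elemMatch` query for appleguru's document shape might look like. The field names come from the question; `nested_elem_match` is a hypothetical helper that only builds the query document. Note that `$elemMatch` means "at least one element matches", not "every element matches".

```python
def nested_elem_match(outer_path, inner_field, condition):
    """Build {outer: {$elemMatch: {inner: {$elemMatch: condition}}}}."""
    return {outer_path: {"$elemMatch": {inner_field: {"$elemMatch": condition}}}}


query = {"my_key": "abcdefg"}
query.update(
    nested_elem_match("raw_json.areas.0.array_1", "array_2", {"value_I_want": 0})
)
# Matches documents where at least one array_1 entry contains at least one
# array_2 entry whose value_I_want equals 0.
print(query)
```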
[18:15:45] <quuxman> edrocks: I've wondered this myself. What driver are you using? I'm using pymongo. If the thread properly exits, it shouldn't be a problem, but if it blocks for some reason, you'll eat up DB connections
[18:16:22] <edrocks> quuxman: Im using mgo(golang driver) I found in the docs it auto logs out after the session is destroyed
[18:16:39] <quuxman> edrocks: of course if the thread is unexpectedly blocking, it'll probably do this before your db close call anyway, so it doesn't matter. I assume you mean closing the connection. What do you mean logout?
[18:18:11] <edrocks> or if you just need to do some work for a little bit
[18:26:46] <kakashiA1> hey guys, I want to model a one to many relationship with mongoose (one student can have many classes)
[18:27:59] <kakashiA1> dont know how, does anybody know a good example for that?
[18:28:57] <AireTamStorm> student / classes : separate collections?
[18:28:58] <edrocks> kakashiA1: did you read the docs for one to many relationships? they have a nice example iirc
[18:29:35] <edrocks> you just reference the many ids from an array in your one so your students would have arrays of class ids they attend
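The referencing layout edrocks describes, sketched with plain Python dicts standing in for documents. The field names (`class_ids`, `name`) and the sample data are invented for illustration; in mongoose the same shape would be an array of ObjectId references on the student schema.

```python
# Two collections, modelled as lists of documents.
classes = [
    {"_id": 1, "name": "Algebra"},
    {"_id": 2, "name": "Physics"},
    {"_id": 3, "name": "History"},
]
student = {"_id": 10, "name": "Ada", "class_ids": [1, 3]}


def classes_for(student_doc, class_docs):
    """Resolve a student's referenced class ids into full class documents."""
    by_id = {c["_id"]: c for c in class_docs}
    return [by_id[i] for i in student_doc["class_ids"] if i in by_id]


print([c["name"] for c in classes_for(student, classes)])
# Against a real database the lookup would be a second query, for example
# with pymongo: db.classes.find({"_id": {"$in": student["class_ids"]}})
```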
[19:23:10] <arussel> I've got: "query not recording (too large)", is there any setting I can change to still see the query ?
[19:30:33] <krion> is a member of replicaset with status "startup2" able to vote ?
[19:31:37] <krion> like right now i've got 4 members: 1 primary, 1 secondary, 1 arbiter, and another secondary in startup2 state. can you confirm that i won't face a "split brain" situation? (except if i lose two members)
[19:43:15] <MacDaddy> anyone else having trouble on OSX 10.10 getting mongodb to run using launchctl and plists? I can get it running with mongod, but not launchctl
[20:03:12] <appleguru> quuxman: Can you give me an example of how I can use $elemMatch to evaluate every entry in an array?
[22:17:56] <Guest966> I'm doing a create for posts with an author field - but I want the returned value to have a populated author field from a users table?
[22:20:20] <Guest966> Or would I need to do a separate query on callback?
[23:57:46] <Synt4x`> is there something wrong w/this statement? games = db.season_schedule.find({"type":"REG", "$or":[{"year":2013},{"year":2014}]})
[23:59:00] <Synt4x`> the $or part of it more specifically
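For what it's worth, the statement is a valid query document as written. A sketch of an equivalent form using `$in`, plus a tiny local evaluator (hypothetical, covering only equality, `$or` and `$in`) to show the two select the same documents. The sample documents are invented. If the real query returns nothing, one thing worth checking is whether `year` is stored as a number or as a string.

```python
query_or = {"type": "REG", "$or": [{"year": 2013}, {"year": 2014}]}
query_in = {"type": "REG", "year": {"$in": [2013, 2014]}}


def matches(doc, query):
    """Illustration-only matcher: supports equality, $or and $in, nothing else."""
    for key, cond in query.items():
        if key == "$or":
            if not any(matches(doc, sub) for sub in cond):
                return False
        elif isinstance(cond, dict) and "$in" in cond:
            if doc.get(key) not in cond["$in"]:
                return False
        elif doc.get(key) != cond:
            return False
    return True


games = [
    {"type": "REG", "year": 2013},
    {"type": "REG", "year": 2012},
    {"type": "PST", "year": 2014},
    {"type": "REG", "year": "2014"},  # string year: matched by neither query
]

selected_or = [g for g in games if matches(g, query_or)]
selected_in = [g for g in games if matches(g, query_in)]
print(selected_or == selected_in, selected_or)
```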