#mongodb logs for Wednesday the 4th of February, 2015

[00:06:34] <morenoh151> should I be able to do db.users.find({ _id: '54ab08f9e898f4457e2b6138'})?
[00:06:47] <morenoh151> that's my user's ObjectId value
[00:06:51] <morenoh151> why doesn't it work?
[00:08:24] <morenoh151> I have to do ObjectId('54ab08f9e898f4457e2b6138') why?
[00:09:18] <cheeser> '...' isn't an ObjectId...
[00:11:18] <morenoh151> cheeser: what?
[00:11:31] <joannac> a string is not an objectID
[00:11:58] <morenoh151> so given an id string how would you find a document by this id?
[00:12:28] <joannac> db.coll.find({_id: new ObjectId(string)})
[00:12:51] <cheeser> that
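For reference, a minimal shell sketch of the lookup being discussed (the collection name and id string come from the conversation above):

    // a plain string and an ObjectId are different BSON types, so the
    // string form matches nothing; wrap the hex string first
    db.users.find({ _id: ObjectId('54ab08f9e898f4457e2b6138') })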
[00:14:19] <morenoh151> I'm getting `"ObjectId" used outside of binding context. (block-scoped-var)` lint warning. Do I require('mongodb') in this file?
[00:14:43] <joannac> sigh
[00:15:08] <joannac> what language are you using?
[00:15:37] <joannac> in any case, go google whatever language you're using and "new objectid mongodb"
[00:15:45] <joannac> and figure out how to do it in that language
[00:15:52] <morenoh151> cool
[00:24:33] <morenoh151> oh cool mongoose has findById 👍
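A minimal mongoose sketch of that helper, assuming a User model is already defined elsewhere:

    // findById accepts the plain hex string and casts it to an ObjectId
    User.findById('54ab08f9e898f4457e2b6138', function (err, user) {
      if (err) return console.error(err);
      console.log(user);
    });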
[00:54:41] <katfang> Hi! I'm migrating my java mongo driver from 2.11 to 2.13. I figured out how to migrate the credentials/auth, but is there a way to check the authentication before performing operations?
[01:35:14] <cheeser> katfang: you have to perform an operation against the DB first. could be as simple as listing the collections.
[01:36:15] <katfang> cheeser: Thanks. Is there really no other way? Or is there a really "cheap" operation? (I have a lot of collections)
[01:38:17] <joannac> list the databases?
[01:38:44] <joannac> search in a collection that doesn't exist?
[01:38:51] <joannac> depends what you're testing
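In shell terms, a cheap round trip that exercises authentication might look like the sketch below (ping is one option of my choosing; joannac's suggestions above work the same way, and the Java driver can issue the same command through its command helpers):

    // ping is about the cheapest server round trip available
    db.runCommand({ ping: 1 })
    // or read from a collection that doesn't exist: an unauthorized
    // client gets an error rather than an empty result
    db.no_such_collection.findOne()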
[02:51:20] <dlewis> I've got a quick query question, how do I find all documents in a collection where the attribute of my condition is in an array? ex: https://gist.github.com/tetsuharu/25da90501e40fac2e16b
[02:51:48] <cheeser> http://docs.mongodb.org/manual/reference/operator/query/type/
[02:53:17] <cheeser> oh, sorry. i misread that.
[02:57:54] <Boomtime> dlewis: http://docs.mongodb.org/manual/reference/operator/query/elemMatch/
[02:58:09] <dlewis> aha, this is it. thanks Boomtime
[02:58:30] <dlewis> db.users.find({ name: 'Marc Weijmar '
[02:58:33] <dlewis> er, sorry
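A sketch of $elemMatch from the page linked above, with invented collection and field names:

    // match documents whose grades array holds at least one element
    // that satisfies both conditions at once
    db.students.find({ grades: { $elemMatch: { subject: 'math', score: { $gte: 8 } } } })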
[05:06:59] <arussel> is there a way to set multiple fields, but one of them only if it is not already set ?
[05:32:11] <Boomtime> arussel: I don't think you can do what you want in a single operation (use two update operations using $exists as a predicate)
[05:32:24] <Boomtime> you may be interested in this feature request: https://jira.mongodb.org/browse/SERVER-6566
[05:33:30] <Boomtime> alternatively, if you have a reasonable minimum bound you may be able to make use of $min
[05:37:56] <arussel> Boomtime: thanks, SERVER-6566 would be nice. I don't have a bound, those are strings
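A sketch of the two-operation approach Boomtime describes, with invented field names:

    // update 1: the fields that should always be set
    db.coll.update({ _id: someId }, { $set: { a: 1, b: 2 } })
    // update 2: matches only while c is still unset, thanks to $exists
    db.coll.update({ _id: someId, c: { $exists: false } }, { $set: { c: 3 } })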
[09:16:01] <oznt> i have a replica set with mongodb 2.6.4 is it a bad idea adding a new member with >2.6.4 ?
[09:28:45] <josias> hi, i have a collection with 8.361.936 documents and want to aggregate 1.381.740 of them. The matching costs 50ms but with the aggregate it needs 2888ms. why is it so slow? the command is db.collection.aggregate([ { "$match" : { "$and" : [ { "id.site" : 1 }, { "id.date" : { "$gte" : ISODate("2015-01-25T23:00:00.000+0000") } }, { "id.date" : { "$lte" : ISODate("2015-02-03T22:59:00.000+0000") } } ] } } ,{ "$group" : { "_id" : 1, pi : {
[09:30:51] <josias> it gets worse the more properties i add to the aggregate (5224ms if i aggregate Year, Month & Day)
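A hedged note, not from the conversation: when $match is the first pipeline stage it can use an index, which is worth checking in a case like this, though a $group over 1.4 million matched documents will always cost more than the bare match:

    // a compound index covering the $match predicates (2.6-era shell helper)
    db.collection.ensureIndex({ 'id.site': 1, 'id.date': 1 })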
[09:43:56] <stark_> Hello Friends.....I am facing a problem with mongo start
[09:44:16] <stark_> To run mongo every time I need to first run this command - sudo mongod --dbpath=/var/lib/mongodb
[09:44:43] <stark_> In next terminal I can use mongo
[09:45:03] <stark_> If I close the previous instance running sudo mongod --dbpath=/var/lib/mongodb
[09:45:08] <stark_> mongo stops too
[09:45:18] <stark_> let me know how I could solve this
[09:45:59] <stark_> the same thing prompts every time, i.e. warning: Failed to connect to 127.0.0.1:27017, reason: errno:111 Connection refused
[09:46:15] <stark_> I removed the mongod.lock file as well
[09:46:37] <stark_> but still I need to run these two commands to access mongodb shell
[09:46:53] <stark_> any hint for this ??
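For context, on Ubuntu the packaged server is normally run as a background service instead of a hand-started mongod (a sketch; the exact service name depends on how mongodb was installed):

    sudo service mongodb start   # daemonized; dbpath comes from /etc/mongodb.conf
    mongo                        # the shell can now connect to 127.0.0.1:27017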
[10:17:53] <josias> is nobody here to help :(
[11:55:52] <dsirijus> StephenLynx: hey, tried running "mongod --dbpath=/var/lib/mongodb" without sudo?
[11:55:55] <dsirijus> oops
[11:56:04] <dsirijus> wrong nick
[11:56:31] <dsirijus> a question - does tree depth of a document have much performance implication?
[11:57:14] <dsirijus> say, should i have {spell:{book:Array, points:Number}} or {spellbook:Array, spellpoint:Number}, i.e.?
[11:59:46] <kali> {spell:{book:Array, points:Number}} is fine. nested arrays, on the other hand, can make some requests difficult
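The two layouts under discussion, sketched in the shell (either holds the same data; one level of nesting is cheap):

    // nested form
    db.characters.insert({ spell: { book: ['fireball'], points: 7 } })
    // flattened form
    db.characters.insert({ spellbook: ['fireball'], spellpoints: 7 })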
[12:01:23] <dsirijus> kali: yeh, was just reading up on that. ideally, i should have a fixed schema, ideally fixed array length preallocated?
[12:01:53] <dsirijus> array preallocation will increase the size tho, even though it should increase speed
[12:02:02] <dsirijus> not sure how to guesstimate that though
[12:02:12] <dsirijus> (i'm a bit under-level to do performance testing for that)
[12:03:34] <kali> unless you plan on growing documents a lot, it's not a problem
[12:05:05] <dsirijus> kali: "a lot" being defined relative to original document size?
[12:05:09] <dsirijus> or some fixed numbers?
[12:06:48] <kali> more like "often"
[12:07:09] <kali> and i'm also aware these answers are not satisfactory :)
[12:08:39] <dsirijus> :)
[12:10:20] <kali> i mean, if you plan on having 1M documents in a collection, and 1000 events/s to be recorded with $push, then yeah, you are in trouble
[12:21:48] <dsirijus> kali: i'm on mongoose. $push would mean inserting new document or replacing the old one in entirety?
[12:22:42] <dsirijus> um, probably inserting array element. :)
[12:22:53] <dsirijus> yeh
[12:30:47] <StephenLynx> no need to have a fixed size array
[12:31:08] <StephenLynx> just don't use nested arrays for things that you expect to grow forever, dsirijus
[12:31:16] <StephenLynx> because eventually you will hit the 16mb limit.
[12:32:03] <StephenLynx> In my project I changed how posts in a thread are stored and how bans in a forum are stored too.
[12:32:27] <StephenLynx> because if I kept them on a sub array, a thread would have a post limit and a forum would have a limited number of active bans
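The two storage patterns being compared, sketched with invented names:

    // embedded pattern: every new post grows the thread document,
    // which is bounded by the 16mb document limit
    db.threads.update({ _id: threadId }, { $push: { posts: post } })
    // separate-collection pattern: each post is its own document, unbounded
    db.posts.insert({ thread: threadId, text: '...' })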
[12:34:25] <dsirijus> hm. ok. good. thank you
[13:07:00] <dsirijus> hey, um, another question - does mongo keep all its databases and all its collections in memory?
[13:07:57] <dsirijus> (thinking about splitting the db into two, one needed for actual application running, and other for sort of internal logging thingie)
[13:10:31] <dsirijus> hm, maybe i should pick another solution for these purposes
[13:54:27] <StephenLynx> no, it doesn't.
[13:54:37] <StephenLynx> dsirijus
[13:55:05] <StephenLynx> and mongo for logging sounds like a good idea.
[13:55:09] <StephenLynx> not much relational work
[14:04:25] <StephenLynx> does the driver for node support io when building?
[14:04:40] <StephenLynx> I know it can't build the native BSON module for pre versions
[14:04:48] <StephenLynx> unless you specify the install folder
[15:32:25] <GothAlice> StephenLynx: In my own forums software I resolved the 16MB limit differently. First, I evaluated the existing (phpBB-based) dataset, and discovered that all threads and all replies to all threads would easily fit within a single document (so not really a problem), but while still embedding replies into a thread, if the thread grows too large it automatically creates a new "continuation" thread. 16MB of text, though, amounts to ~3.3 million
[15:32:25] <GothAlice> English words.
[15:33:03] <GothAlice> For anything that needs to grow very rapidly, and where I can offload "processing" of the data that is streaming in, I use a capped collection. These are very efficient to insert records into, being a ring buffer.
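A capped collection is created explicitly, e.g. (collection name and size invented for illustration):

    // a fixed-size ring buffer: inserts are cheap and the oldest
    // documents are overwritten once the size limit is reached
    db.createCollection('events', { capped: true, size: 100 * 1024 * 1024 })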
[15:33:12] <StephenLynx> GothAlice a valid approach, too "hackish" for me, though.
[15:33:27] <GothAlice> With 14,000 users, we have yet to encounter a continuation. ;)
[15:33:46] <GothAlice> (The average response size is 30 or so words.)
[15:33:49] <Derick> 16mb is a lot of text
[15:34:16] <StephenLynx> it may not be put in use, but all the code and rules are still there.
[15:34:29] <GothAlice> With that average reply size each thread could store ~111,000 replies before requiring a continuation.
[15:35:45] <StephenLynx> another problem I found out is how limited it is to work with embedded arrays in mongo.
[15:37:04] <StephenLynx> and how you often have to perform additional queries or unwind arrays.
[15:37:05] <GothAlice> https://github.com/bravecollective/forums/blob/develop/brave/forums/component/thread/model.py#L99-L195 < these are the thread reply management functions in the original fork of my forums. Working with complex nested structures becomes easier when you realize you can give the embedded documents their own IDs. :D
[15:37:18] <StephenLynx> oh, I do that.
[15:37:22] <StephenLynx> always did.
[15:37:40] <StephenLynx> it doesn't solve the lack of tools, like sorting or splicing.
[15:37:49] <GothAlice> Uhm…
[15:37:55] <GothAlice> Slicing is totally a thing.
[15:38:08] <StephenLynx> that, slicing.
[15:38:12] <GothAlice> https://github.com/bravecollective/forums/blob/develop/brave/forums/component/thread/model.py#L175 "slice first" and https://github.com/bravecollective/forums/blob/develop/brave/forums/component/thread/model.py#L188 "slice last"
[15:38:49] <StephenLynx> you can't slice sub arrays on aggregate.
[15:39:30] <StephenLynx> I don't know python nor the python driver. what are you exactly doing there?
[15:39:44] <StephenLynx> did you just take the data from mongo and slice it in the code?
[15:39:53] <GothAlice> No.
[15:39:59] <GothAlice> Those lines generate MongoDB queries.
[15:40:06] <GothAlice> .first() executes the built up query.
[15:40:20] <GothAlice> It's simply using: http://docs.mongodb.org/manual/reference/operator/update/slice/
[15:40:38] <StephenLynx> it isn't slicing a sub-array, right?
[15:40:42] <GothAlice> Sorry, http://docs.mongodb.org/manual/reference/operator/projection/slice/#proj._S_slice
[15:40:45] <StephenLynx> on an aggregate pipeline.
[15:40:46] <StephenLynx> is it?
[15:41:09] <GothAlice> $slice is a standard projection operator. No reason I can think of preventing it from working in an aggregate.
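The projection form of $slice from the page linked above, sketched with invented field names:

    // return the thread with only the last 10 comments of its array
    db.threads.find({ _id: threadId }, { comments: { $slice: -10 } })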
[15:41:16] <StephenLynx> heh
[15:41:18] <StephenLynx> me neither.
[15:41:20] <GothAlice> I simply have no need for aggregates in that Forum software.
[15:41:22] <StephenLynx> but it is like that.
[15:41:32] <StephenLynx> lemme get my source.
[15:41:46] <StephenLynx> i remember very clearly you can't slice sub arrays on aggregate.
[15:42:42] <StephenLynx> https://jira.mongodb.org/browse/SERVER-6074
[15:42:43] <StephenLynx> here
[15:42:50] <GothAlice> A ha.
[15:43:18] <StephenLynx> and of course there's the sorting, that requires you to unwind the thread.
[15:43:21] <StephenLynx> the array*
[15:43:35] <StephenLynx> not that you should never use sub arrays
[15:43:47] <StephenLynx> you are just more limited when you do so.
[15:44:28] <StephenLynx> and you will either have a limit or will have to make a work-around to avoid said limits.
[15:44:45] <GothAlice> Waitaminute. $unwind, $skip, $limit…
[15:44:50] <StephenLynx> in my forum system I am ok with some limits.
[15:45:11] <cheeser> this isn't for a sporting goods store then? :D
[15:45:12] <StephenLynx> I don't make work arounds for them
[15:45:37] <GothAlice> $unwind, $skip, $limit, $group back on _id… that doesn't look so much like a workaround as an actual solution.
[15:45:38] <StephenLynx> but the ones I was not ok was post limit on threads and ban limits on forums
[15:45:46] <GothAlice> Since you're unwinding to sort anyway.
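The pipeline shape GothAlice is describing, sketched with invented field names:

    db.threads.aggregate([
      { $match: { _id: threadId } },
      { $unwind: '$posts' },                     // one document per array element
      { $sort: { 'posts.created': 1 } },
      { $skip: 20 },
      { $limit: 10 },
      { $group: { _id: '$_id', posts: { $push: '$posts' } } }   // regroup the page
    ])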
[15:46:01] <StephenLynx> what if I want data from the document but the array is empty?
[15:46:11] <StephenLynx> the unwind will take the data from the document away.
[15:46:14] <StephenLynx> right?
[15:46:20] <GothAlice> I'd have to test it to see.
[15:46:30] <StephenLynx> been there, done that.
[15:46:42] <StephenLynx> if you unwind an empty array you end up with nothing.
[15:47:03] <StephenLynx> and even with that, unwinding and doing all that impacts performance and ram usage.
[15:47:23] <StephenLynx> valid, of course. but it comes with costs.
[15:48:15] <GothAlice> Basically what I'm saying, having written forum software that works quite effectively at some truly ludicrous scale and doesn't seem to encounter the problems you are describing, at all, is that you seem to be making the problem harder on yourself. Re-normalize your data to simplify the queries (apparent "data duplication" may be beneficial simply to make querying easier) and accept that certain techniques aren't hacks, like the idea of
[15:48:15] <GothAlice> thread continuations.
[15:48:41] <GothAlice> Notably they aren't really a hack because you'll realistically never need to use it. (3.3 million words per thread…)
[15:49:04] <StephenLynx> I do have data duplication.
[15:49:58] <StephenLynx> and basically what I'm saying is, you are more limited when working with sub-arrays because mongo simply can't do some stuff with sub-arrays.
[15:50:05] <GothAlice> Sure.
[15:50:09] <StephenLynx> if you are ok with these restraints, then it's fine.
[15:50:09] <GothAlice> Right tool for the job and all.
[15:50:31] <GothAlice> Relational data restricts you to the data design of a spreadsheet. That's more restrictive, IMHO.
[15:50:33] <StephenLynx> if you are ok with work arounds for limits that you won't reach is fine too.
[15:51:17] <StephenLynx> restrictive in what sense? performance?
[15:51:33] <GothAlice> How you are forced to think about and structure your data.
[15:52:15] <StephenLynx> how are you not forced to follow a structure for your data when you use sub-arrays?
[15:52:38] <StephenLynx> you still have to be consistent, no matter the approach.
[15:52:43] <GothAlice> Nothing is forcing you to take that approach.
[15:53:09] <GothAlice> Saw one development team give a presentation on their Awesome Django App™ at a conference once, and spent the entire presentation with my head in my hands in disbelief. They had a relational table with five of each of the basic datatypes (str1, str2, …, int1, int2, …) and an internal mapping for different record types. Instead of, I dunno, using a schemaless database that can handle inheritance patterns better.
[15:53:40] <StephenLynx> I can't see the relation with what I am saying here.
[15:53:42] <GothAlice> Entity Attribute Value was invented to have a relational database act like a document one.
[15:54:51] <Derick> EAV, shudder
[15:55:21] <GothAlice> StephenLynx: I don't quite grok your complaints. Yes, no matter what you pick you will have to understand the tool to best use it. This shouldn't be a surprise or point of contention. I've demonstrated forums that don't exhibit the problems you describe at all as a counter-point to your list of "problems" and "hack solutions". Practicality beats purity!
[15:56:17] <StephenLynx> In my defense, I never said the limitations of mongo were absolutely a problem. I said that, for me, I'd rather not use sub-arrays for the sake of purity.
[15:56:17] <GothAlice> (And pre-aggregation of statistics is one highly useful way to avoid needing complex aggregate queries when the time comes to display those stats.)
[15:56:43] <GothAlice> StephenLynx: So you then introduce race conditions due to treating MongoDB like a relational database and faking joins client-side.
[15:56:45] <StephenLynx> I also pre-aggregate some stuff, like post count.
[15:56:55] <GothAlice> So, certainly a trade-off, and not nearly as "pure" as you might think.
[15:56:56] <StephenLynx> no, I don't fake joins.
[15:57:18] <StephenLynx> when I query for posts I just look for all posts that match both the forum and the thread
[15:57:50] <StephenLynx> I don't look for pieces of data in another collection and add it to my data.
[15:57:53] <GothAlice> "Give me thread X and all replies to thread X." — fake join requiring a findOne for the thread, and subsequent separate find for the replies.
[15:58:21] <StephenLynx> it is a simple where x = y
[15:59:01] <GothAlice> Yup, likely using IDs as a _fake join_. Here's an example of something: when you delete a thread, do you keep the replies around?
[15:59:19] <StephenLynx> no, I delete all posts that match said forum and thread.
[16:00:03] <GothAlice> So there exists a time at which replies exist, but the linkage to the thread is broken because the thread has been deleted, but the replies haven't been yet. (Or the reverse, depending on what order you do things.)
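Sketched with invented names, the window in question:

    // two separate statements, so for a moment one half exists
    // without the other, whichever order they run in
    db.threads.remove({ _id: threadId })
    db.posts.remove({ thread: threadId })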
[16:00:14] <GothAlice> http://www.javaworld.com/article/2088406/enterprise-java/how-to-screw-up-your-mongodb-schema-design.html may be a worthwhile read.
[16:00:37] <StephenLynx> will read.
[16:01:13] <jenner> heya
[16:02:13] <jenner> guys, is there a way to fix "Document Error: key for index {...} too long for document: {...}" without deleting the document as suggested in the upgrade faq?
[16:02:13] <GothAlice> Admittedly, I, too, do the "fake join" thing at the Forum -> Thread level. Threads are their own collection (nesting replies within them), but individual Forums are, indeed, their own thing. Double nesting has even greater restrictions (notably on $elemMatch), so I don't go that far.
[16:02:57] <StephenLynx> one thing that caused me to move threads to a different collection was pinned threads
[16:03:29] <GothAlice> https://github.com/bravecollective/forums/blob/develop/brave/forums/component/thread/model.py#L47-L50 < I support several flags.
[16:03:38] <StephenLynx> that required sorting by a condition and sorting by time
[16:03:58] <StephenLynx> and how do you sort pinned threads?
[16:04:20] <GothAlice> I do two queries. Step 1: find all sticky threads in the current forum. Step 2: find everything else. That's the "pragmatic" approach to the problem, rather than creating a convoluted query to do it all in one.
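Sketched with invented field names, that two-query approach looks like:

    // query 1: pinned threads for the forum
    db.threads.find({ forum: forumId, sticky: true }).sort({ modified: -1 })
    // query 2: everything else, listed below the pinned block
    db.threads.find({ forum: forumId, sticky: false }).sort({ modified: -1 })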
[16:04:25] <StephenLynx> oh, you also have a separate collection for threads, right?
[16:04:34] <StephenLynx> yeah, I also do that.
[16:04:57] <StephenLynx> I didn't say I didn't support it, I said it caused me to move to a separate collection
[16:05:05] <GothAlice> Threads is the primary collection of the entire app. 99% of the data exists there.
[16:05:39] <GothAlice> Maybe two or three dozen forums organized into a simple hierarchy, and some user mapping stuff. (These forums use an external authentication/authorization provider.)
[16:06:19] <StephenLynx> yeah, your threads are pretty much like mine.
[16:06:36] <StephenLynx> except I keep posts separate because I didn't want to deal with the 16mb limit on threads :v
[16:07:02] <GothAlice> Except that until your users write 3.3 million words within a thread, you won't have to deal with the 16mb limit on documents.
[16:07:05] <GothAlice> So, non-issue.
[16:07:51] <Nilium> I hate dealing with the 16mb limit because I wouldn't be hitting it if people before me hadn't made awful decisions on DB structure ಠ_ಠ
[16:08:04] <GothAlice> And creating a new thread, cloning the author and first post (which is itself a reply to the overall thread), adding a reply with a link to the original thread pointing at the new one, then locking the old thread isn't exactly difficult, if you ever do need to deal with continuations.
[16:08:23] <Nilium> Nothing says fun like ~12,000 collections per DB.
[16:08:40] <GothAlice> Nilium: Indeed. MongoDB really requires you to sit down and think about your data model.
[16:08:44] <StephenLynx> lol @ Nilium
[16:08:59] <StephenLynx> by default I keep stuff on the same documents.
[16:09:02] <GothAlice> Nilium: Run into the namespace limits yet? :D
[16:09:17] <Nilium> Haven't _noticed_ that yet but I wouldn't be surprised if it's just that I'm not noticing it.
[16:09:47] <GothAlice> Well, index creation would explode and eventually you wouldn't be able to create new collections.
[16:09:58] <Nilium> Hm, probably haven't seen that yet.
[16:10:23] <StephenLynx> so your system creates collections dynamically?
[16:10:26] <GothAlice> http://docs.mongodb.org/manual/reference/limits/#namespaces < ah, you should be good until you get to ~24,000 namespaces (collections + indexes).
[16:17:23] <Nilium> StephenLynx: Yep. Not my idea.
[16:17:48] <StephenLynx> obviously. I don't expect someone who does that kind of thing to be able to use IRC
[16:18:26] <Nilium> I keep trying to get time to just burn it all to the ground and rebuild it in some manner that's actually optimal, but it's tricky
[16:18:48] <cheeser> burnination!
[16:18:49] <Nilium> Also hard because there are about 48 databases like this and it just makes me want to cry.
[16:19:08] <GothAlice> Clearly you need to mandate some courseware on your co-workers.
[16:19:20] <GothAlice> At my place of work we have silly hats.
[16:19:39] <Nilium> And I'm not even that familiar with MongoDB, I just bothered to read a tiny bit and then looked at our DBs and couldn't figure out how it got to be this way
[16:20:32] <Nilium> The closest I get to silly hats is wearing my Tool hat, 'cause having "Tool" across my forehead all day works wonders.
[16:20:42] <GothAlice> If you're about to do something in production (deploy, mangle data, etc.) you must wear the outrageous hat of silliness. We also have a data design bowler hat. These tend to draw attention of "backup" developers to watch over the shoulder and, in theory, prevent issues.
[16:21:43] <Nilium> That would be somewhat useful here.
[16:22:19] <GothAlice> We also have a continuous integration server running that projects captioned pony GIFs on the wall depending on the outcome of a test run.
[16:22:36] <GothAlice> (Also shaming the developer who introduced the fail.)
[16:22:44] <Nilium> Right now we just require every new schema change (in RDBMS land) to go through live ops and our DB admins, and they at least catch a bunch of things there.
[16:23:02] <Nilium> Zero oversight for the Mongo side of things 'cause only a couple people know what's going on there.
[16:23:21] <GothAlice> Ouch.
[16:23:24] <GothAlice> Low bus counts are bad.
[16:23:25] <StephenLynx> I didn't even dare to suggest mongo here on the new system.
[16:23:38] <StephenLynx> I knew the code monkeys would fuck it up royally.
[16:26:59] <GothAlice> http://cl.ly/image/2b391C2Z3c3X3B2m1G04 < ha, testing ponies. Managed to find the pic. :D
[16:27:10] <dsirijus> anyone ever needed to run multiple instances of mongo on the same machine?
[16:27:14] <dsirijus> (for dev purposes)
[16:27:17] <GothAlice> dsirijus: Yup.
[16:27:25] <dsirijus> GothAlice: os x, dear? :)
[16:27:30] <GothAlice> Indeed.
[16:27:35] <dsirijus> GothAlice: do tell. :)
[16:27:39] <Nilium> I use OS X.
[16:28:00] <GothAlice> brew install mongodb; then I use https://gist.github.com/amcgregor/c33da0d76350f7018875
[16:28:03] <Nilium> It's my compromise to get both a usable shell and software I want.
[16:28:39] <GothAlice> I can't afford to pay myself to be my own sysadmin on Linux. I tried, but I wasted far too much of my time just trying to keep Linux stable as a desktop. (*shakes a fist at nVidia*)
[16:29:18] <dsirijus> GothAlice: oh, basically, multiple so-called "shards"?
[16:29:20] <GothAlice> dsirijus: That script constructs a local 3x3 sharded replica set with authentication enabled to allow my projects to test sharding keys in development.
[16:31:13] <dsirijus> cool, cool.
[16:31:15] <dsirijus> thanks!
[16:31:24] <GothAlice> dsirijus: As for per-project isolation, I generally develop with a single mongod process running via homebrew's normal mongodb launchd integration. They're generally only isolated by database on the same "host".
[16:31:30] <GothAlice> That script is mostly for the automated test suites to run.
[16:32:36] <dsirijus> eh, i should really run many VMs, but that's, even though pure, a PITA
[16:32:56] <GothAlice> Vagrant makes it nice.
[16:33:30] <dsirijus> additional requirement of mine is running on low bandwidth, since i run on mobile internet traffic
[16:33:34] <dsirijus> $5 per GB!
[16:33:38] <dsirijus> sheesh
[16:33:49] <Nilium> Vagrant's nice, SaltStack is nice aside from the documentation and setting it up.
[16:33:56] <GothAlice> So one assumes you have a heavily caching transparent proxy in place to ensure things aren't downloaded twice. ;)
[16:34:03] <Nilium> Once it's set up and I never have to look at it again, though, it's nice.
[16:34:45] <Nilium> I give SaltStack credit for coming up with an environment that makes PHP appear desirable though.
[16:35:24] <dsirijus> GothAlice: hm. good point. i should set up VPS while i'm at that too. but no time, eh.
[16:37:25] <GothAlice> Some days I feel coding is like: http://cl.ly/image/1f2f1Y1m0s3D1X0g0G2j Most of the time not so much, though.
[16:39:47] <dsirijus> heh, i'm actually lovin' it these days. but no time to do it all! especially by-the-book
[16:42:20] <dsirijus> mongo instance that's serving an infrequent logging purpose (but with a fair amount of data per document) shouldn't really require much RAM, right?
[16:42:26] <dsirijus> basically, just writing to it
[16:42:40] <dsirijus> (and it's not even _that_ much data per document)
[16:44:37] <GothAlice> dsirijus: MongoDB uses memory-mapped on-disk stripes. This means it'll always have lots of RAM "wired" to its memory space, but the kernel will decide how much to actually load. There will be a tendency for MongoDB to absorb as much free RAM as possible still, though, based on data set size.
[16:45:48] <GothAlice> My personal at-home dataset far exceeds available RAM—26 TiB—and is almost write-only. Queries are _incredibly_ slow the odd time I need to use them, though. (Not even my indexes fit in RAM, usually a desperate sign one needs to throw more hardware at the problem.)
[16:47:56] <dsirijus> yeah, none of these queries will be app-facing, only for on-demand manual queries
[16:48:40] <dsirijus> and i sure hope my indexes will exceed RAM - it's invoices accounting :D
[16:49:53] <GothAlice> dsirijus: Y'know, I'd be quite curious as to the data model you're using to store invoice data. We're also doing this at work, and I like to compare notes on design. :)
[16:51:37] <dsirijus> GothAlice: well, one collection is actual invoices, but it's really mock-invoices since i'm not actually selling to users, platforms are (fb, apple, google, ...)
[16:52:38] <dsirijus> the other is records of purchases of virtual goods with virtual currency
[16:53:09] <dsirijus> GothAlice: so, i'm happy to not need to go through what you will :)
[16:54:21] <dsirijus> though, dunno, i guess it would be nice to finally grok that aspect of paperwork by modelling it
[16:56:05] <GothAlice> dsirijus: https://gist.github.com/amcgregor/655f5d453193b19e280a
[16:56:10] <GothAlice> Obv. many methods are missing. ;)
[16:56:36] <GothAlice> (As are the signal handlers for initial and state transitions.)
[16:59:26] <dsirijus> you're optimizing for key string lengths too?
[16:59:34] <dsirijus> lol. sweeet.
[16:59:55] <GothAlice> Indeed. MongoDB stores the full key with each document, so why make it worse than it needs to be?
[17:00:04] <cheeser> indeed.
[17:00:09] <GothAlice> (Besides, this is why abstraction layers were invented. ;)
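A sketch of the idea, with invented short keys:

    // mmapv1 stores every key name inside every document, so
    // { c: ..., t: ... } costs less disk than { created: ..., total: ... };
    // the abstraction layer maps the friendly names back for the app
    db.invoices.insert({ c: new Date(), t: 1999 })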
[17:00:34] <dsirijus> would be nice if it would optimize it by itself
[17:00:40] <dsirijus> like protobuf dicts or something
[17:00:52] <GothAlice> dsirijus: I'm working on that. ;)
[17:01:10] <dsirijus> hah. awesome.
[17:01:42] <cheeser> you need schema support/validation before mongo can optimize it automatically
[17:02:01] <GothAlice> Yeah, I say I'm working on that, but I'm working on that in the context of the MongoEngine ODM.
[17:02:30] <cheeser> i've debated something like that in morphia, too.
[17:02:55] <cheeser> just not enough gain for the work around maintaining it all just yet.
[17:03:09] <dsirijus> hm, ok, true, might be better delegated to a layer independent of mongo.
[17:10:20] <StephenLynx> "Clueless. A metaprogramming programming language that uses itself to define itself." HAHA I have one project like that
[17:10:38] <StephenLynx> https://github.com/mrseth/Henge
[17:10:46] <GothAlice> ¬_¬ Yeah. The only loop structure defined in the core language is "forever".
[17:10:46] <StephenLynx> college quality project :v
[17:11:02] <StephenLynx> mine is some sort of storage tool
[17:11:24] <StephenLynx> when I come to think about it, it is sort of like Access
[17:11:46] <StephenLynx> the cherry on the cake is how I store stuff to XML
[17:12:00] <StephenLynx> I didn't serialize stuff myself, I just used some weird library that did that
[17:12:17] <StephenLynx> because my objects were incredibly complex, it would be really slow
[17:13:33] <GothAlice> https://gist.github.com/amcgregor/016098f96a687a6738a8 < somewhat more complete documentation. Mine is written on top of RPython and uses Pypy's JIT compiler, so it's pretty darned fast. OTOH, I'm working on making it do insane things, like full dependency graph generation for your code and automatic statement re-ordering for optimized parallelization.
[17:14:00] <GothAlice> (You don't do threads in Clueless, threads do you.)
[17:54:24] <logiii> Hi guys
[17:54:52] <logiii> Anyone know how is mapreduce implemented?
[17:56:00] <logiii> do i run mapreduce from my node.js application directly on every query, or do i run mapreduce once and it will remember that?
[18:05:41] <logiii> sorry I got d/c
[18:05:57] <logiii> if anyone said anything to me
[18:41:47] <theRoUS> Derick (or anyone, for that matter): i'm having a real problem with my document's "id" field (over the name of which i have no control) being conflated with and coerced to "_id".
[18:42:16] <theRoUS> i'm at a couple of removes (mongoid in rails) so i'm not sure where it's happening.
[18:42:41] <theRoUS> but the upshot is that "_id" is getting *my* value, and "id" is getting nil.
[18:42:56] <theRoUS> i haven't found anything online (yet) on how to avoid this
[19:07:48] <Sam___> theRoUS: what are you using for an ORM or mapping from your code to mongo documents?
[19:08:36] <theRoUS> Sam___: mongoid
[19:09:45] <Sam___> theRoUS: I am not familiar with mongoid, but are you able to set a field as primary key?
[19:10:15] <theRoUS> Sam___: probably.
[19:10:46] <theRoUS> i don't think it's actually down at the mongodb level; i think it's mongoid doing the conflating/coercion
[19:10:55] <Sam___> no that is correct
[19:10:57] <Sam___> it isn't mongo
[19:11:36] <Sam___> but there may be something in mongoid that lets you set the primary key. This will prevent _id from getting the value. I had a similar problem with mongoengine
[19:15:05] <theRoUS> Sam___: i'll check for that. thanks!
[21:15:12] <gabeio> hey so how does one (on linux) run the mongodb under a different user than the mongo user? I just don't know the setting name in the config file
[21:16:50] <gabeio> (with forking)
[21:19:19] <Derick> gabeio: you just start it as a different user
[21:19:38] <Derick> if you mean through services, then there is something else I suppose
[21:19:50] <gabeio> yeah ubuntu services
[21:20:08] <gabeio> do you think it might be in the init.d file?
[21:20:20] <Derick> one sec, I'll check
[21:21:48] <Derick> gabeio: the mongodb file in init.d has something like DAEMONUSER
[21:21:58] <gabeio> kk
[21:22:12] <gabeio> i'll change it there hopefully that works :) thanks
[21:22:25] <Derick> there is a ubuntu way for doing it though
[21:23:31] <gabeio> do you happen to know it? lol I just know the init scripts are just to start/stop/restart/reload/status
[21:23:44] <gabeio> or at least usually
[21:23:53] <Derick> sorry - no
[21:23:55] <gabeio> (im running server mode so i don't have a gui)
[21:24:13] <gabeio> so the "ubuntu way" might be gui and i dont have it ;]
[21:25:06] <Sam___> gabeio: can't adjust the user in the init script
[21:25:18] <Sam___> sorry i mean you CAN
[21:26:20] <gabeio> well i can but is there a better way ?
[21:27:54] <Sam___> gabeio: that is the best way
[21:28:17] <Sam___> that guarantees that mongod starts with the correct user every time
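A sketch of the kind of line to look for (the variable name comes from Derick's recollection above; packages differ, so check your own script):

    # excerpt from /etc/init.d/mongodb on Ubuntu
    DAEMONUSER=${DAEMONUSER:-mongodb}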
[21:31:09] <scellow> Anyone know a java driver guide to connect to a remote server ?
[21:31:38] <scellow> I wrote a java program to manage my db, but i have issues connecting to my server
[21:33:32] <joannac> http://docs.mongodb.org/ecosystem/tutorial/getting-started-with-java-driver/#making-a-connection
[21:33:39] <dacuca> guys I have mongodb 2.2 with replica set and I want to upgrade to 3.0 (when it is out). What’s the best way to do it?
[21:35:30] <dacuca> I was thinking about adding new nodes to the replica with mongodb 2.4, then shutting down the old ones, and so on. is that ok?
[21:39:20] <gabeio> this is probably a better question than the previous one: do any of you know of a better way to allow mongodb to write to an external usb on ubuntu without giving it root?
[21:40:33] <scellow> joannac, thanks ill take time to read that !
[21:40:53] <joannac> dacuca: sure, but that'll be slow
[21:41:11] <dacuca> gabeio: you can add the user to the group that has permission to write on the device
[21:41:22] <dacuca> joannac: but it is safer isn’t it?
[21:41:42] <joannac> dacuca: why not upgrade in place?
[21:42:17] <scellow> joannac, i have already read that guide, i have no problem connecting to my local test db, the problem is i can't connect to the remote server, the ports are open, mongod is correctly binded to localhost, the port is correct, but unable to connect
[21:42:20] <dacuca> I’m just afraid of data corruption or something
[21:42:53] <dacuca> we don’t want downtime
[21:42:55] <joannac> scellow: what? mongod is correctly binded to localhost? how is that "correct"
[21:43:23] <gabeio> there is no way to change the permissions on the usb... it is set to root:root >.> I tried sudo chown root:other group it just freaks
[21:43:50] <dacuca> gabeio: try to mount the external usb as the user who runs mongodb
[21:44:09] <joannac> dacuca: okay, so take backup before you upgrade, like you should do anyway
[21:45:37] <scellow> joannac, i was talking about the mongodb.conf file. from what i have understood, mongodb is correctly listening on 127.0.0.1 and can accept connections on port 27017, so i should be able to access the db? am i wrong?
[21:45:40] <dacuca> if the plan is switching to wiredtiger probably the best plan is to backup -> jump from 2.2 to 3.0 -> restore backup, isn’t it? joannac
[21:46:20] <dacuca> since I read somewhere that downtime is not an option when switching to wiredtiger
[21:47:12] <gabeio> dacuca: you are making me feel like a noob at this but I suck at file systems. how would I go about mounting it under another user and keeping it mounted there automatically on restarts? if it's a command I know I can add it to the rc.locals startup script
[21:48:23] <dacuca> gabeio: read about fstab
[21:48:24] <joannac> scellow: totally wrong. bindIp means it listens only on 127.0.0.1
[21:48:39] <dacuca> http://ubuntuforums.org/showthread.php?t=1880685
[21:49:12] <joannac> dacuca: no.
[21:49:26] <joannac> just do your upgrade plan if it makes you feel better
[21:49:54] <joannac> as long as your databases aren't massive it should only be a bit slower
[21:51:07] <scellow> joannac, but that doesn't answer my problem: why can't i connect to the server, if it listens on 127.0.0.1 and the port 27017 is open?
[21:51:25] <joannac> scellow: because you're not connecting on localhost, you're connecting remotely?
[21:51:29] <scellow> what would be wrong, (beside me :p)
[21:52:45] <dacuca> joannac: is there a plan to upgrade to wiredtiger (from mmap) without downtime then?
[21:53:38] <scellow> joannac, mongodb is on my server, i'm trying to connect to that server from my computer, mongoClient = new Mongo("myHost" , 27017); and i get a connection refused, i checked the port and it's open
[21:54:41] <joannac> scellow: do you understand the difference between "local" and "remote"?
[21:55:28] <joannac> dacuca: yes?
[21:55:28] <scellow> joannac, of course i do
[21:55:36] <dacuca> that’s probably a binding issue scellow
[21:55:59] <joannac> scellow: is the connection string you pasted a local or remote connection?
[21:56:37] <scellow> joannac, myHost = my server's host
[21:56:50] <joannac> scellow: yes, i know. answer my question
[21:57:40] <scellow> joannac, a remote connection
[21:57:47] <joannac> scellow: correct.
[21:58:08] <joannac> your mongod has bindIP 127.0.0.1, which means it only accepts LOCAL connections
[21:58:23] <joannac> fix it. either bind it on a remote IP as well, or remove bindIP altogether
[21:59:12] <dacuca> you can try switching 127.0.0.1 to 0.0.0.0
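In the 2.x-era config file that change looks like the sketch below; note that listening on all interfaces exposes the server, so firewall it or enable auth:

    # /etc/mongodb.conf
    bind_ip = 0.0.0.0   # listen on every interface instead of loopback only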
[22:01:36] <scellow> ohh, i think things weren't clear in my head .. i guess i'll have to read more articles about that stuff
[22:04:29] <dacuca> did you solve the issue scellow ?
[22:04:53] <scellow> dacuca, i will try
[22:05:14] <scellow> reloading the mongodb config will close all the currently open connections ?
[22:09:20] <dacuca> scellow: probably…
[22:10:48] <scellow> Ok, i have to wait then :p
[22:22:29] <jpfarias> hey guys
[22:25:08] <jpfarias> so I’ve got a database that is quite large, it has 200GB of data in 2 collections and about 60 million objects (15 million in one collection and 45 million in the other), this database is running off of a single machine with 24GB of memory. My question is, would queries run faster if it was running sharded over X machines with 24/X GB of memory? like, 2 machines with 12GB of memory each? or 10 machines with 2.4 GB of memory each?
[22:40:05] <Sam___> jpfarias: ....yes if you build the shard correctly and have the right shard key
[22:40:25] <jpfarias> hey Sam___
[22:41:09] <bybb> Hi all!
[22:41:26] <Sam___> jpfarias: hi
[22:41:30] <Sam___> bybb: hi
[22:41:43] <bybb> Can I rs.reconfig the replica set name (_id)?
[22:41:47] <jpfarias> I’m not sure what is the advantage of sharding...
[22:41:55] <bybb> I tried and it didn't work
[22:42:39] <Sam___> jpfarias: there are many advantages to sharding.....distributed queries...means faster response time....also means you can scale
[22:42:54] <Sam___> bybb: what is the command you issued?
[22:42:58] <bybb> I get a not "running with --replSet" message
[22:43:19] <bybb> Sam__ I'm following this http://docs.mongodb.org/manual/reference/method/rs.reconfig/#rs.reconfig
[22:43:36] <bybb> using rs.conf() and rs.reconfig()
[22:43:39] <Sam___> bybb: do you have a replica set? sounds like you are in single mode
[22:43:52] <Sam___> rs. is only reserved for replica sets
[22:44:27] <Sam___> you need to start 3 instances (2 nodes and arbiter) then rs.initiate()
[22:44:34] <Sam___> then rs.add() to add the other 2
[22:44:39] <Sam___> then you will be in a replica set
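The sequence Sam___ describes, sketched in the shell with invented hostnames:

    // run on whichever node should become primary
    rs.initiate()
    rs.add('host2:27017')       // link in the second data-bearing node
    rs.addArb('host3:27017')    // the arbiter
    rs.status()                 // confirm all members are healthy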
[22:44:58] <bybb> well I'm quite lost, running with a replica set returns this "set name may not change"
[22:45:17] <Sam___> jpfarias: we can discuss the pros and cons of sharding if you would like....but if you want to scale you data set and make your queries faster then you will need to shard with mongo
[22:45:42] <Sam___> bybb: how did you start your mongod?
[22:45:48] <Sam___> command line or with init file?
[22:46:02] <jpfarias> Sam___: I see, I just didn’t find much information about when to shard or not
[22:46:17] <bybb> Sam__ for the replica set I use a conf file
[22:46:32] <jpfarias> or if it is better to have more smaller machines or fewer bigger machines
[22:46:41] <Sam___> jpfarias: it is different for everyone to be honest....resources, velocity of data, etc.
[22:46:48] <Sam___> more smaller is always better
[22:46:55] <jpfarias> oh really?
[22:46:59] <Sam___> yes
[22:47:07] <jpfarias> any reason why?
[22:47:08] <Sam___> if you have the budget
[22:47:22] <Sam___> smaller data sets return faster results
[22:47:38] <Sam___> it really depends on your use case and data to be honest....different for everyone
[22:47:47] <jpfarias> ok
[22:47:51] <Sam___> but the best part of mongo is it scales out not up...so you don't need big machines
[22:48:08] <Sam___> bybb: i think we need to establish some definitions....
[22:48:16] <Sam___> replica set = 3 mongods
[22:48:17] <bybb> sure
[22:48:33] <Sam___> each mongod is started with an independent init script?
[22:48:51] <Sam___> each mongod has a separate config file?
[22:49:33] <bybb> I was initiating a replica set when I saw my mistake naming the replset
[22:49:42] <bybb> yes
[22:50:00] <bybb> but for now my replica set is one mongod
[22:50:03] <Sam___> are you using yaml config files?
[22:50:15] <Sam___> doesn't matter either way...but you need to make sure that all the config files match
[22:50:24] <Sam___> replSet in each one should have the same name
[22:50:34] <Sam___> stop all of the mongod processes and check the config files
[22:50:43] <bybb> yeah that was my mistake
[22:51:15] <bybb> they have the same name now, but it's different from the one I wrote at the beginning
[22:51:34] <bybb> does it make sense?
[22:51:50] <Sam___> yeah
[22:51:55] <bybb> right now I have only one member
[22:52:00] <Sam___> so you need to stop the mongod processes
[22:52:03] <Sam___> all of them
[22:52:09] <bybb> ok
[22:52:13] <Sam___> then go into /data/db and delete all files
[22:52:17] <Sam___> then start them back up
[22:52:27] <Sam___> rs.initiate()
[22:52:36] <Sam___> i am assuming there is no data in this replica set that you care about?
[22:53:09] <bybb> nothing yet
[22:53:13] <Sam___> k
[22:53:17] <bybb> ok easy
[22:53:29] <Sam___> just clear out all the mapped files
[22:53:36] <Sam___> then restart and build the replica set again
[22:54:15] <bybb> thanks I'll do that and let you know if you're still around
[22:54:31] <Sam___> k... I charge by the minute
[23:05:02] <bybb> do you execute rs.initiate() on each server?
[23:06:34] <Sam___> bybb: nope
[23:06:37] <Sam___> just the primary
[23:06:45] <bybb> ...
[23:06:48] <Sam___> then you rs.add(server:port)
[23:07:05] <Sam___> just rs.initiate on one of them
[23:07:46] <bybb> oh I just need to start mongod on the other servers?
[23:07:51] <Sam___> yeah
[23:07:57] <Sam___> then add them
[23:08:43] <bybb> I have three members from rs.conf() I ran on the first server
[23:09:04] <bybb> Is the replica set on?
[23:09:42] <bybb> ok I just understood the meaning of a sentence
[23:09:58] <bybb> forget it
[23:11:19] <bybb> thanks a lot, I'll handle this mistake by myself
[23:12:00] <Sam___> you don't need to run rs.conf() at all
[23:12:33] <Sam___> sorry i misread that...you can run rs.conf() to see the replica set
[23:12:38] <Sam___> you don't have to turn it on or off
[23:12:57] <Sam___> mongod's are all independent...replica set just means you have linked them together
[23:13:06] <Sam___> and they will replicate data from primary to secondary
[23:13:08] <bybb> yeah no worries, I thought I needed to configure the replica set on each server
[23:13:14] <Sam___> k
[23:13:27] <Sam___> so you should be good to go with your replica set now
[23:14:28] <bybb> until the next surprise... I've been playing for two days with VirtualBox, Debian and MongoDB
[23:14:39] <bybb> when you're not used to it, it's kind of a battle
[23:14:43] <Sam___> lol
[23:14:48] <Sam___> yeah a little.
[23:14:55] <Sam___> you should throw up a aws machine
[23:15:02] <Sam___> create a VPC in aws...its free
[23:16:17] <bybb> I even forgot why I didn't do that...