PMXBOT Log file Viewer


#mongodb logs for Tuesday the 27th of January, 2015

[00:35:34] <olso> hey guys, how can I speed this up? mongo.collection.find({baseModel:'CK752E001'}) ... it takes around 100ms, and there are 48k docs in the collection
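[Editor's note: the usual fix for a slow equality find is an index on the queried field. A minimal sketch in the mongo shell; the collection name `products` is illustrative, not from the log:]

```javascript
// Without an index, find() on baseModel scans all 48k documents.
// An index turns that collection scan into a B-tree lookup.
db.products.ensureIndex({ baseModel: 1 })   // createIndex() on MongoDB 2.6+

// Verify the index is actually used by inspecting the query plan:
db.products.find({ baseModel: "CK752E001" }).explain()
```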
[00:53:59] <unholycrab> when i add a new replica set secondary member, it takes 3-5 days to sync up with the rest of the replica set. is there any way i can monitor the progress of the index building phase?
[00:54:05] <unholycrab> as the index building phase seems to take the longest
[00:56:12] <unholycrab> im referring of course to parts of the STARTUP2 state
[00:57:57] <unholycrab> tailing the logs, i get something like this:
[00:58:00] <unholycrab> Tue Jan 27 00:55:09.006 [rsSync] Index: (2/3) BTree Bottom Up Progress: 842230600/94654761788%
[00:58:27] <unholycrab> 88% of some index its building
[00:58:40] <unholycrab> this doesn't help me determine overall progress
[01:12:37] <harph> I have an old master-slave model running that I can't shut down yet, and I want to spin up a new instance with the same data that syncs with the master, but I need it to be a replSet to be able to use mongo-connector with elasticsearch. Is there a way to do this? I'm getting the following error when I pass that flag: SEVERE: Failed global initialization: BadValue replication.replSet is not allowed when slave is specified
[01:17:07] <jrua_> hi there!
[01:17:13] <jrua_> quick question here : How many failover nodes do I have on a 5 node replica set where 1 is an arbiter?
[01:19:46] <jrua_> anyone ?
[01:20:11] <cheeser> what is a failover node?
[01:20:28] <unholycrab> jrua_: you mean you have 1 Master, 3 Slaves, 1 Arbiter ?
[01:20:41] <cheeser> primary, secondary, arbiter
[01:20:45] <unholycrab> sorry Primary/Secondary
[01:20:48] <cheeser> master/slave is something else
[01:20:48] <cheeser> :D
[01:20:52] <unholycrab> yeah yeah
[01:21:37] <jrua_> unholycrab: yep correct
[01:22:03] <jrua_> so three failover nodes right ?
[01:22:18] <unholycrab> jrua_: paste your rs.conf()
[01:22:20] <jrua_> as the arbiter do not handle any data
[01:22:28] <unholycrab> into a pastebin or gist.github.com
[01:22:42] <unholycrab> an arbiter can't become primary, but a secondary can, depending on your replica set configuration
[01:23:23] <jrua_> just reading the documentation and this doubt came to mind
[01:23:28] <jrua_> no worries, thx
[04:31:51] <Jonno_FT1> how can I make a mongo instance on one machine clone to another, without using replicas?
[04:36:51] <cheeser> rsync?
[04:36:59] <Jonno_FTW> kk
[04:37:02] <Jonno_FTW> makes sense
[04:37:16] <Jonno_FTW> and I can just copy the data directory over?
[04:37:30] <cheeser> "probably"
[04:37:35] <Jonno_FTW> >_>
[04:37:39] <cheeser> yep! covered.
[04:39:44] <Jonno_FTW> cheeser: what about db.cloneDatabase()?
[04:39:57] <Jonno_FTW> I have a 1GB link between the two machines
[04:40:04] <Jonno_FTW> *Gb
[04:41:01] <cheeser> that could, yeah.
[04:41:10] <cheeser> it's a "pull" though not a push.
[04:41:18] <cheeser> just fyi
[04:41:19] <Jonno_FTW> either is fine
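[Editor's note: a rough sketch of both options discussed above; hostnames and database names are placeholders. As cheeser notes, cloneDatabase is a pull, run from the destination:]

```javascript
// Run on the DESTINATION mongod's shell: pulls "mydb" from the source host.
db.getSiblingDB("mydb").cloneDatabase("source-host.example.com")

// Or copy a database under a different name on the destination:
db.copyDatabase("mydb", "mydb_copy", "source-host.example.com")

// For the rsync route, stop mongod (or db.fsyncLock()) on the source first,
// so the copied data files are in a consistent state.
```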
[10:30:37] <awsmsrc> hey peeps, i have a quick question on unique indexing if anyone has time
[10:30:52] <awsmsrc> i want a unique index where field == some string
[10:31:25] <awsmsrc> i.e i have a state machine and i only want a user to be in an “available” state in one room/call/channel
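[Editor's note: one way to express "unique only while the document is in a given state" is a partial unique index. Note that partialFilterExpression requires MongoDB 3.2+, which postdates this log; at the time, a sparse unique index on a field present only in the "available" state was the usual workaround. Field and collection names below are illustrative:]

```javascript
// Unique constraint that applies only to documents in the "available" state
// (MongoDB 3.2+)
db.sessions.createIndex(
  { userId: 1 },
  { unique: true, partialFilterExpression: { state: "available" } }
)
```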
[13:06:08] <nukeu666> in aggregation which syntax is correct? aggregate({...},{...},{...}) or aggregate([{...},{,,,}])?
[13:06:26] <nukeu666> i see both of them in examples
[13:43:08] <StephenLynx> nukeu666 it depends
[13:43:21] <StephenLynx> in nodejs the aggregation syntax is different
[13:43:44] <StephenLynx> and uses the second example
[13:43:48] <StephenLynx> the console uses the first
[13:43:53] <StephenLynx> I don't know about other drivers
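[Editor's note: both spellings nukeu666 saw are valid in the mongo shell; the array form is the portable one (drivers generally take an array, and only the array form supports options such as allowDiskUse). A sketch with an illustrative pipeline:]

```javascript
// Legacy varargs form (shell only):
db.orders.aggregate({ $match: { status: "A" } },
                    { $group: { _id: "$cust", n: { $sum: 1 } } })

// Array form, equivalent, and the one drivers expect:
db.orders.aggregate([ { $match: { status: "A" } },
                      { $group: { _id: "$cust", n: { $sum: 1 } } } ])
```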
[14:18:51] <g-hennux> hi!
[14:20:05] <g-hennux> I use text indexes with (a rather old version, 2.4.9) of mongodb and my results are quite counterintuitive when the indexed text contains special punctuation characters
[14:21:06] <g-hennux> so even though I have a document with a field "Set „Easy 8“" (and that field has twice the weight of the other fields in the text index), that document will get a really low score
[14:21:22] <g-hennux> in fact, even when i search for "easy 8", it is last in the list
[14:21:37] <g-hennux> now I am wondering if the punctuation does something weird to the indexing
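[Editor's note: on 2.4, text search is run as a command rather than via an operator (the $text query operator arrived in 2.6), and the tokenizer strips punctuation, so "„Easy 8“" is indexed roughly as the terms "easy" and "8". A hedged sketch, collection name illustrative:]

```javascript
// MongoDB 2.4: text search as a database command
// (2.4 also requires starting mongod with textSearchEnabled=true)
db.items.runCommand("text", { search: "easy 8" })
```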
[15:20:26] <RaceCondition> when should I be using _id: ObjectId("...") vs _id: "..."?
[15:20:43] <RaceCondition> I'm seeing inconsistent behavior: with one collection (a non-capped one), I can use both, with another one (capped), I can only use the former
[15:20:53] <RaceCondition> same with both find() and update()
[15:23:10] <RaceCondition> this is what's happening: http://pastebin.com/wtHvfAri
[15:23:24] <RaceCondition> with db.jobs, I can use both versions; with db.queue, I can only use the ObjectId(...) one
[15:23:28] <RaceCondition> why is that?
[15:23:41] <StephenLynx> I never use _id
[15:23:53] <StephenLynx> neither the other one
[15:24:17] <StephenLynx> mongo just messes so much with it I rather just roll with my own index.
[15:24:27] <RaceCondition> really?
[15:24:36] <RaceCondition> is that standard practice? that I should not use _id at all
[15:24:44] <Derick> RaceCondition: you should use _id
[15:24:45] <StephenLynx> I don't know about other people.
[15:24:55] <RaceCondition> Derick: as opposed to ...?
[15:25:04] <StephenLynx> I know it makes no difference performance wise to set your own index
[15:25:05] <RaceCondition> as opposed to what StephenLynx is suggesting?
[15:25:12] <Derick> RaceCondition: not as opposed to anything
[15:25:33] <StephenLynx> there is another problem with _id, it is not user friendly
[15:25:36] <RaceCondition> because my question is `_id: ObjectId("abc")` vs `_id: "abc"`
[15:25:39] <Derick> StephenLynx: ?
[15:25:50] <StephenLynx> you can't say "thread id = 6546546546" when its the first thread created in a forum, for example
[15:25:52] <RaceCondition> when do I need to wrap the BSON IDs in ObjectId and when not
[15:25:53] <Derick> RaceCondition: yes, use your own ID value if you can - if not, use ObjectID.
[15:26:14] <Derick> RaceCondition: you should always wrap real Object IDs in an ObjectId
[15:26:22] <Derick> but you can pick your own non-object-id values for _id
[15:26:47] <RaceCondition> so what you're saying is that Mongo remembers whether I saved an object with _id:"foo" vs _id:ObjectId("foo"), and I have to use the same value later?
[15:26:58] <StephenLynx> and since mongo doesn't have joins, you either make two queries to get a user-friendly value or you display the ugly-ass _id
[15:27:00] <RaceCondition> i.e. if I save {_id: "foo"}, I can't later query it with {_id: ObjectId("foo")}?
[15:27:46] <StephenLynx> and then there are the rules on projecting
[15:27:47] <Derick> RaceCondition: yes, you need to use the same value later
[15:27:56] <Derick> "foo" and ObjectId("foo") are absolutely not the same
[15:28:12] <StephenLynx> the main problem with _id is that it does not behave like a regular field.
[15:28:26] <StephenLynx> and has all these little rules
[15:28:37] <Derick> you can store ObjectIDs in other fields too...
[15:28:51] <StephenLynx> not talking about that.
[15:29:15] <RaceCondition> Derick: gotcha... I never realised that was the case; I thought _id was *always* converted to ObjectId internally
[15:29:25] <Derick> RaceCondition: nope, it can be any (scalar) datatype
[15:29:38] <RaceCondition> ok... got it
[15:30:29] <RaceCondition> but wait, Derick, that can't be true
[15:30:44] <RaceCondition> I can query my db.jobs with both ObjectId("foo") and "foo", and both return the same document
[15:31:41] <Derick> I don't believe that
[15:31:45] <RaceCondition> also, the document that gets returned either contains _id:"foo" or _id:ObjectId("foo") depending on which one I used to query
[15:31:53] <Derick> you have two documents
[15:31:59] <Derick> one with each _id value
[15:32:00] <StephenLynx> telling you, _id is a fucking mess.
[15:32:16] <Derick> RaceCondition: do a find without arguments, you'll see it
[15:32:23] <RaceCondition> oh, crap, actually you're right
[15:33:08] <RaceCondition> so what does ObjectId do exactly then?
[15:33:08] <xcesariox> is anyone here to help me out?
[15:33:08] <xcesariox> anyone active?
[15:33:17] <Derick> RaceCondition: it's a wrapper around some optimised binary data
[15:33:17] <RaceCondition> xcesariox: why don't you just ask your question?
[15:33:19] <xcesariox> RaceCondition: it stores an incremental ID
[15:33:32] <RaceCondition> incremental ID?
[15:33:37] <Derick> that's not quite true
[15:33:56] <RaceCondition> ok, so it just stores those 96 bits, instead of the 24char string; got it
[15:34:19] <xcesariox> Derick: 1 minute, let me paste it into a gist.
[15:34:19] <Derick> an ObjectID consists of 4 bytes unix timestamp, 3 bytes hostname part, 2 bytes process Id and 4 bytes counter (to prevent collisions within each second)
[15:34:38] <Derick> hmm, I got one of those numbers wrong
[15:34:50] <Derick> but yes, it stores binary data, and not that 24 char string
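[Editor's note: the number Derick flags as wrong is the last one; the layout is a 4-byte unix timestamp, 3-byte machine id, 2-byte process id, and a 3-byte counter. That layout means the creation time can be read straight out of the 24-character hex string, as this standalone sketch shows:]

```javascript
// Decode the embedded unix timestamp from a 24-char hex ObjectId string.
// Layout: 4 bytes timestamp | 3 bytes machine | 2 bytes pid | 3 bytes counter.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("not a 24-char hex ObjectId");
  }
  // First 8 hex chars = 4 bytes = seconds since the epoch.
  return new Date(parseInt(hex.slice(0, 8), 16) * 1000);
}

// e.g. an ObjectId beginning "54c7a123..." decodes to a Date on 27 Jan 2015
```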
[15:35:03] <RaceCondition> now the question is, if I want to use ObjectId but I'm using JSON internally to represent these documents, why can't I use $oid to represent an ObjectId as JSON?
[15:35:22] <Derick> RaceCondition: I don't quite get that?
[15:35:46] <RaceCondition> I mean, is there a way to encode an ObjectId in the JSON representation of a BSON document
[15:36:25] <Derick> not precisely
[15:36:36] <Derick> we do something like "Extended JSON", but that's not really a standard
[15:36:48] <Derick> and not all drivers support it
[15:37:03] <RaceCondition> even on the mongo REPL, I cannot use {"$oid": "foo"}
[15:37:15] <Derick> yes, you can not use field names starting with a $
[15:37:27] <xcesariox> Derick: where do i put this "results = Book.collection.map_reduce(map, reduce, out: "vr")" command syntax into? into rails console or mongodb console directly? https://gist.github.com/shaunstanislaus/0f8d87939c0ab01ce5d6
[15:37:29] <RaceCondition> so can I represent an ObjectId with this "Extended JSON" somehow?
[15:37:57] <Derick> RaceCondition: you have already seen it, it's this $oid stuff that the shell shows :-/
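[Editor's note: for reference, MongoDB's Extended JSON (as emitted by mongoexport and understood by some drivers) encodes an ObjectId as a wrapper document; the hex value here is a placeholder:]

```json
{ "_id": { "$oid": "54c7a1230000000000000000" } }
```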
[15:38:20] <xcesariox> Derick: i am actually following a tutorial book but this syntax doesn't seems to work for me. it says unexpected token when i type it into mongo console.
[15:38:42] <Derick> xcesariox: It looks like Ruby - it's not valid javascript syntax
[15:38:58] <xcesariox> Derick: so what should i do
[15:39:15] <xcesariox> Derick: but this is not ruby results = Book.collection.mapReduce(map, reduce, out: "Tom")
[15:39:18] <RaceCondition> Derick: when does the shell use $oid? I haven't seen it; it only seems to use ObjectID
[15:39:26] <Derick> RaceCondition: oh - hmm
[15:39:50] <Derick> you're right, my bad
[15:39:56] <Derick> where did you find this $oid then?
[15:39:58] <xcesariox> Derick
[15:40:06] <Derick> xcesariox
[15:40:29] <xcesariox> Derick: how do i do it correctly
[15:40:36] <RaceCondition> Derick: I guess it's a driver thing then... and not at all natively supported by Mongo at all
[15:40:37] <Derick> xcesariox: sorry, I don't know ruby
[15:40:45] <Derick> RaceCondition: yes - i think so too
[15:40:47] <xcesariox> Derick: i am asking about mongo here.
[15:40:50] <Derick> RaceCondition: which driver do you use?
[15:40:55] <RaceCondition> reactivemongo
[15:41:01] <xcesariox> @Derick: results = Book.collection.mapReduce(map, reduce, out: "Tom")
[15:41:20] <Derick> xcesariox: I still don't know - that looks ruby, not mongodb javascript shell stuff
[15:41:39] <xcesariox> Derick: can you teach me how mapreduce work
[15:42:05] <Derick> RaceCondition: ah, so the scala or java driver might do that then
[15:42:33] <RaceCondition> Derick: ReactiveMongo *is* the driver, but yeah, I get your point
[15:43:20] <StephenLynx> xcesariox do you know a guy nicknamed OP and live in vancouver?
[15:43:32] <StephenLynx> nvm
[15:43:35] <StephenLynx> you probably dont
[15:43:45] <xcesariox> StephenLynx: whats with that question
[15:43:54] <StephenLynx> I thought you were some dude
[15:43:59] <StephenLynx> because if "shaun"
[15:44:08] <StephenLynx> of*
[15:44:45] <xcesariox> StephenLynx: how the fuck you know i am shaun
[15:44:53] <xcesariox> StephenLynx: i didn't put it in my second nick or anything
[15:45:20] <StephenLynx> lol
[15:45:29] <Derick> github.com/shaunstanislaus/ ?
[15:45:31] <StephenLynx> that
[15:45:35] <xcesariox> lol
[15:46:11] <StephenLynx> lost my shit
[15:46:30] <cheeser> it's right over there
[15:49:43] <RaceCondition> Derick: just for the record, turns out ReactiveMongo does support the {"$oid" -> ...} notation, so all is good
[15:50:09] <RaceCondition> I'd just been intermixing ObjectIds and strings all along, not realizing they're completely distinct types
[15:51:11] <Derick> ok :-)
[16:10:05] <rodfersou> hi everyone
[16:12:29] <xcesariox> where do i put this "results = Book.collection.map_reduce(map, reduce, out: "vr")" command syntax into? into rails console or mongodb console directly? https://gist.github.com/shaunstanislaus/0f8d87939c0ab01ce5d6
[16:12:39] <xcesariox> can anyone help me out?
[16:12:51] <rodfersou> in one collection I save the friendly url as the 'slug' attribute of my document, and ensure an index on this field... my question is: knowing that I'll use the _id as my sharding key when I move to a cluster, will my database get slower when I find documents by the 'slug' attribute?
[16:12:59] <cheeser> well, that's not valid in the shell, for sure.
[18:25:45] <klmlfl> Hi room , I have a replica set running in staging environment, can I get some pointers on how to test it?
[18:27:52] <klmlfl> I am thinking about testing beyond just cycling the service on and off
[18:37:35] <Tyler__> So if I do a findOne and the document doesn't exist, does the code in the callback function still execute?
[18:40:09] <Tyler__> User.findOne({code:req.body.code},function(err, doc){ Does this part still execute if it can't find one? });
[18:49:48] <rodfersou> Tyler__: for me this doesn't make sense.. but try to make a simple test database with test data and see what happens
[18:50:02] <rodfersou> Tyler__: sorry.. I'm still learning too :)
[18:50:08] <Tyler__> lol thanks dude :)
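[Editor's note: to answer Tyler__'s question: with Node-style callbacks (the Node driver, Mongoose), the callback still runs when nothing matches; a missing document is not an error, so err is null and doc is null. A hedged sketch; handleDbError and handleNotFound are hypothetical helpers:]

```javascript
// "Not found" is NOT an error in Node-style callbacks: check doc, not just err.
User.findOne({ code: req.body.code }, function (err, doc) {
  if (err)  return handleDbError(err);   // driver or network problem
  if (!doc) return handleNotFound();     // query matched nothing: doc === null
  // ... use doc here ...
});
```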
[19:54:01] <theRoUS> Derick: you ever work with rails and mongoid? or know of anyone?
[19:57:03] <joep_> General question: say I have a tree of documents and I need to get from the bottom-most node to the top node, what is the most efficient way to do that in MongoDB?
[19:57:32] <joep_> (presently I have back-references on each document and run a query on each node to get its parent; so, for N depth there are N queries)
[20:04:34] <joep_> (and by "back-reference" I mean a reference to the parent node)
[20:05:08] <neo_44> joep_: there are a lot of considerations for that question.....but if you just want to go from bottom to top...query for a node with no parent.
[20:05:29] <neo_44> but if you want to take the path from the child to the top....that is different
[20:05:36] <joep_> yes, that
[20:05:45] <joep_> so, I have a child node, I need to know its top most parent
[20:06:10] <joep_> Any recommendations?
[20:06:12] <neo_44> that would always be the head...
[20:06:15] <neo_44> for top most
[20:06:31] <neo_44> do you need to know the other child/parents along the route...or just the top most parent?
[20:06:40] <joep_> just the top most
[20:07:16] <joep_> so, say I have X => Y => Z, where X is the child and Z is the top most parent, given X I want Z
[20:07:43] <neo_44> k...and how are they realted?
[20:07:49] <neo_44> related*
[20:07:52] <neo_44> by an id?
[20:08:08] <joep_> yea; X has a DBRef to Y, and Y has a DBRef to Z
[20:09:27] <neo_44> there isn't anything server side that will help that issue...it is a matter of how you are storing it
[20:09:37] <neo_44> why don't you use a nested document?
[20:10:08] <neo_44> each node could be a sub document
[20:10:16] <joep_> ah, great question; the reason I didn't nest the document was because some documents may be owned by more than one document
[20:10:19] <neo_44> it would be 1 query server side....and logic client side
[20:11:11] <neo_44> do they need to be in sync at all? if not you could duplicate the data in both nested documents?
[20:11:32] <joep_> yea, I thought about that too. Unfortunately, data consistency is highly important.
[20:11:59] <joep_> so, if I change a sub-document, I would have to always update all documents with that sub-document
[20:12:36] <neo_44> do you know before you create one document if it will be shared?
[20:12:40] <neo_44> or is that dynamic?
[20:13:43] <joep_> in most cases its dynamic; i.e. there *are* times when the document isn't shared, but becomes shared at a later time
[20:13:51] <neo_44> k
[20:14:19] <neo_44> what is more important....speed or complexity?
[20:14:32] <neo_44> or shall i say.....if it is complex and fast is that okay?
[20:14:51] <joep_> to give a little context, the system has many collections where each collection of documents gets curated, modified, and associated in varying order
[20:15:41] <joep_> simplicity is probably more important since I have a team I will have to educate if it is complex
[20:15:47] <neo_44> is there a service or data access layer? or is the client writing directly to the databses?
[20:17:00] <joep_> we are using Doctrine ODM, an ORM written in PHP; but, we are realizing some limitations, specifically this problem: given X, give me its top most parent Z in an efficient manner
[20:17:31] <neo_44> yeah...kinda need another abstraction layer on top of that
[20:18:36] <neo_44> i use mongoengine (python) and then have another abstraction on top of it for repositories (database access), and then the entity layer... so I can switch out the underlying data store or have more than one at a time
[20:18:46] <neo_44> there isn't an easy fix for your issue....
[20:18:57] <neo_44> i would most likely have a meta store just for the relationships
[20:19:11] <neo_44> but you would have to constantly keep it updated...very complex
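[Editor's note: the N-query pattern joep_ describes (follow each back-reference until there is no parent) can be sketched against an in-memory map; in MongoDB each loop iteration would instead be a findOne on the parent's _id. Names are illustrative:]

```javascript
// Walk parent references from a node up to its root.
// In MongoDB each step would be: node = db.nodes.findOne({ _id: node.parent })
function findRoot(nodesById, startId) {
  let node = nodesById[startId];
  const seen = {};                         // guard against reference cycles
  while (node && node.parent != null && !seen[node._id]) {
    seen[node._id] = true;
    node = nodesById[node.parent];         // one query per level: N queries for depth N
  }
  return node;                             // the top-most parent (Z in X => Y => Z)
}
```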
[20:19:11] <joep_____> whoops
[20:19:29] <neo_44> lol
[20:19:30] <joep_____> I think I managed to duplicate my session; anyway...
[20:19:48] <neo_44> !last necrogami
[20:19:54] <pmxbot> Sorry! I don't have any record of necrogami speaking
[20:19:55] <joep_____> Okay, well its good to know there isn't an obvious: "Hey, you should be doing *this* or *that*"
[20:20:12] <neo_44> !last neo_44
[20:20:12] <pmxbot> I last saw neo_44 speak at 2015-01-27 20:20:12+00:00 in channel #mongodb
[20:20:18] <neo_44> !last 20 neo_44
[20:20:18] <pmxbot> Sorry! I don't have any record of 20 neo_44 speaking
[20:20:35] <joep_____> Hey, thanks a lot neo_44!
[20:20:38] <neo_44> np
[20:20:43] <joep_____> Live long and prosper.
[20:20:48] <neo_44> you as well
[20:28:56] <keso> anybody knows morphia for java?
[20:29:19] <cheeser> what's up?
[20:30:30] <Rodrive> Hi, i have some problem with the start of mongo when i reboot on centos 7. I'm having this log "ERROR: Cannot write pid file to /var/run/mongodb/mongod.pid: No such file or directory". According to github there was a fix on 2.6.5, i am in 2.6.7 and still having the problem.
[20:31:25] <hephaestus_rg> hey i'm trying to connect to my mongolab provided mongodb with the shell command, and it seems to be timing out. any ideas
[20:31:31] <cheeser> does that path exist, Rodrive ?
[20:32:10] <Rodrive> No, not on reboot, that's all the problem. See : https://github.com/mongodb/mongo/commit/50ca596ace0b1390482408f1b19ffb1f9170cab6
[20:33:34] <unholycrab> is there any way to determine the overall progress of the BTree Bottom Up indexing step during the initial sync of a new Secondary member?
[20:33:40] <unholycrab> log output: Tue Jan 27 20:04:45.006 [rsSync] Index: (2/3) BTree Bottom Up Progress: 782212700/94654761782%
[20:34:06] <unholycrab> the 82% doesn't tell me anything about the overall progress at all
[20:35:05] <unholycrab> its like saying "82% of an unknown piece of unknown size out of unknown number of pieces whose sizes are unknown"
[20:35:11] <unholycrab> great! cool.
[20:35:25] <unholycrab> 82% here we go
[20:55:08] <bdiu> Are there any particular performance implications against using a "composite" object as the _id in a document? e.g., _id:{date:"1/2/3",user:ObjectId(...),type:"daily"}
[20:57:18] <neo_44> bdiu: in that format yes
[20:57:30] <cheeser> indexes would be larger
[20:57:47] <cheeser> that's a weird key...
[20:57:59] <neo_44> i use composite indexes just never with the ObjectId also
[20:58:25] <neo_44> more like _id : "email something@somthing.com 1234"
[20:58:36] <bdiu> cheeser: so, indexes actually map to the primary key as stored internally?
[20:58:46] <neo_44> bdiu: there are no primary keys
[20:58:51] <bdiu> well the _id
[20:58:52] <neo_44> but the _id is always indexes
[20:58:55] <bdiu> okay
[20:59:11] <neo_44> but that is why it is good to use composites in the _id
[20:59:21] <neo_44> if you don't need the ObjectId...don't waste the index
[20:59:28] <bdiu> gotcha
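[Editor's note: the two styles being contrasted, sketched in the shell with illustrative values. Note that equality matches on a sub-document _id are sensitive to field order, and the mandatory _id index grows with the size of the key:]

```javascript
// Composite (sub-document) _id: works, but the index entries are large
// and { user: ..., date: ... } would NOT match { date: ..., user: ... }.
db.stats.insert({ _id: { date: "2015-01-27", user: "u123", type: "daily" } })

// Flat string key, as neo_44 suggests: small, predictable index entries.
db.stats.insert({ _id: "daily:u123:2015-01-27" })
```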
[21:00:29] <hephaestus_rg> is there some way to reset? my db on mongolab?
[21:00:39] <hephaestus_rg> i can't connect to it, it keeps timing out :(
[21:06:32] <cheeser> hephaestus_rg: you'd have to talk to the mongolab people
[21:09:12] <hephaestus_rg> cheeser i found a way to restart it on their website. i'll see if that helps...
[21:13:46] <ddod> Can anyone point me towards how I would get a bunch of random documents at once?
[21:14:41] <ddod> (All the solutions I've found so far are for only getting one at a time)
[21:14:46] <GothAlice> ddod: Fetching random documents is actually surprisingly hard.
[21:14:52] <GothAlice> Multiple at once, doubly so.
[21:15:07] <ddod> That's a shame
[21:15:27] <ddod> I was going to use the skip method if I had to do one at a time
[21:15:41] <GothAlice> It'd be easier if we could have direct access to the _id index, then we'd just need to know the count of buckets and bucket size, then randrange several times over it as required by the .limit().
[21:17:05] <ddod> ok, so I guess async + one at a time is the way to do it
[21:17:31] <GothAlice> There should totally be a .sort_by('$bogo') random sort… there's always the statistical chance of bogosort sorting correctly in a single try, but most of the time you'll get back things in a random order… ;)
[21:20:46] <unholycrab> i posted on stack overflow: http://stackoverflow.com/questions/28180167/how-to-determine-overall-progress-of-the-btree-bottom-up-step
[21:21:08] <GothAlice> https://jira.mongodb.org/browse/SERVER-533 < relating to random sort
[21:36:40] <neo_44> ddod: is that for testing? or something you need to do in prod?
[21:37:20] <ddod> neo_44: production
[21:37:55] <neo_44> skip will pull the document into memory...just FYI...doesn't actually skip it.
[21:38:17] <neo_44> I would use 2 queries...first one that projected on only _id...build array...then randomly pull documents with those _ids from the database
[21:40:54] <ddod> as in, get a list of all IDs first and then do a $in to grab all the chosen docs?
[21:45:43] <neo_44> yeah...and you can choose rand() from the array of IDs that you have in the array.
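[Editor's note: neo_44's two-query approach sketched out: fetch only the _ids, sample client-side, then issue one $in query. The sampler itself is plain JavaScript; the collection name in the commented shell queries is illustrative:]

```javascript
// Pick n distinct random elements from an array of _ids
// (partial Fisher-Yates shuffle on a copy of the array).
function sampleIds(ids, n) {
  const a = ids.slice();
  const k = Math.min(n, a.length);
  for (let i = 0; i < k; i++) {
    const j = i + Math.floor(Math.random() * (a.length - i));
    const t = a[i]; a[i] = a[j]; a[j] = t;
  }
  return a.slice(0, k);
}

// In the mongo shell, the two queries would look roughly like:
//   var ids = db.things.find({}, { _id: 1 }).toArray()
//                      .map(function (d) { return d._id; });
//   db.things.find({ _id: { $in: sampleIds(ids, 10) } });
```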
[21:56:02] <Synt4x`> how do I remove a single element from a document? (I added it using this: b.team_stats.update({'id':t1_stats['id'], 'game__id':c}, {'$set': {'poss' : t1_poss, 'pace':t1_pace, 'dRTG':t1_dRTG, 'dEFG':t1_dEFG, 'dTOV':t1_dTOV, 'dREB':t1_dREB, 'dFT':t1_dFT} }) )
[22:07:26] <neo_44> Synt4x`: $unset
[22:14:34] <Synt4x`> neo_44 have never heard of that, thank you!
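[Editor's note: $unset removes the named fields from matched documents, mirroring the $set that added them. A shell-form sketch using a subset of the keys from Synt4x`'s example (the original call is pymongo; t1_stats and c come from that context):]

```javascript
// Remove fields previously added with $set; the value after each
// field name is ignored, "" is the conventional placeholder.
db.team_stats.update(
  { id: t1_stats.id, game__id: c },
  { $unset: { poss: "", pace: "", dRTG: "" } }
)
```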
[22:14:58] <neo_44> np
[22:30:38] <dsirijus> ah, lol. actually got to read up the docs on mongo's site today (and have been using it for a year or so already)...
[22:31:46] <dsirijus> dudes, there should be a required preface on everything about mongo saying "it's not a relational database. writes are atomic at the document level. minimize atomic op count. there are no schemas."
[22:32:16] <dsirijus> i have built basically a fully relational database with a lot of schemas with it :D
[22:33:24] <dsirijus> and, best bit is - i didn't even need relations, basically everything is "contains" relation :D
[22:33:49] <dsirijus> "some" refactoring is due, i believe
[22:44:47] <GothAlice> dsirijus: http://www.javaworld.com/article/2088406/enterprise-java/how-to-screw-up-your-mongodb-schema-design.html is a good read. :)
[22:46:51] <dsirijus> you know what's another good read? mongodb docs! :D
[22:47:01] <dsirijus> thanks
[22:48:39] <dsirijus> aaand done. that was a short read :D
[22:51:07] <GothAlice> Short, but good. Covers several aspects of migrating from a relational design to document-based one.
[22:54:32] <dsirijus> mhm. i'm ashamed to say it, but i turned out to be a dirty hipster with it :/
[22:54:46] <dsirijus> eh eh. live and learn.
[22:55:45] <dsirijus> it's actually pretty sweet db from this point of new understanding
[22:55:45] <GothAlice> I'm currently working on a MongoDB-backed cMS (component management system). https://gist.github.com/amcgregor/4cefefa4a12c9c76a970#file-contentment-model-py-L421-L490 is an example site layout and first page element (from the static content at https://rita.illicohodes.com/). Scroll up for the "schema". ;^)
[22:55:48] <dsirijus> simple
[22:56:18] <dsirijus> i still think i'll keep schemas
[22:56:28] <dsirijus> from development perspective, they're pretty useful
[22:56:55] <GothAlice> Indeed. https://gist.github.com/amcgregor/4cefefa4a12c9c76a970#file-contentment-model-py-L285-L315 < the "Asset" schema from the cMS, as an example.
[22:56:56] <dsirijus> it's just that it's good to be aware that it's an artificial limitation/abstraction
[22:57:41] <GothAlice> Indeed. My cMS example explicitly allows an arbitrary (schema-free) set of "properties" on each Asset to make more flexible use of the underlying DB.
[22:58:11] <dsirijus> aaah, a nice little "trick"!
[22:58:14] <dsirijus> i like it
[22:58:18] <dsirijus> best of both worlds
[22:58:23] <GothAlice> Exactly. :)
[22:58:59] <dsirijus> good good, i'll implement it like zo too
[23:01:01] <GothAlice> For me the biggest advantage of a declarative schema (this style of schema) is that I can mix and match behaviours. I.e. "Taxonomy" knows how to store and maintain a tree, as well as how to manipulate and traverse that tree. That means my main model doesn't need to worry about those details at all. Same with "Indexed", my own full-text index implementation using Okapi BM-25 ranking. (I need to deal with multilingual setups and dynamic
[23:01:01] <GothAlice> content, so MongoDB's indexing won't pass muster.)
[23:04:20] <dsirijus> oh, like, schema inheritance?
[23:04:28] <GothAlice> There's some of that going on, too, yeah.
[23:04:40] <dsirijus> or schema "mixin"?
[23:04:51] <GothAlice> More like a mix-in, yeah.
[23:05:10] <dsirijus> good, good. all great advices. thanks, GothAlice! :)
[23:05:25] <GothAlice> Then I can test the Taxonomy code in isolation without the complexity of the rest of the system. :D
[23:06:02] <dsirijus> oh, this is it. we're not talking anymore. you have actual tabs in your source code
[23:06:10] <dsirijus> i don't speak with such heathens
[23:07:17] <GothAlice> Back from when we had a techblog: https://web.archive.org/web/20130813163534/http://tech.matchfwd.com/your-code-style-guide-is-crap-but-still-better-than-nothing/
[23:07:31] <GothAlice> I was master of the inflammatory blog title. ;)
[23:07:52] <dsirijus> :D
[23:10:28] <GothAlice> Relevant segment from the "Indentation" section of that article: "Do you use spaces in a word processor to line up bullet points? If you do you’ll be first against the wall when the revolution comes!" ;)
[23:12:46] <dsirijus> ok, ok. i'll give you the benefit of doubt.
[23:13:31] <GothAlice> Also, on Github, you can add "?ts=4" (or other replacement size) to render tabs to that size. Sadly this doesn't work on Gist, only main Github file views.
[23:14:08] <dsirijus> ok. will you stop with all that useful information already!?
[23:14:13] <GothAlice> :P
[23:17:06] <hephaestus_rg> hi i have a question about indexes
[23:17:24] <hephaestus_rg> if i query against a certain field frequently, then it makes sense to index it right?
[23:18:00] <hephaestus_rg> in my case, i have an ISODate field called 'published_at' that i query very often. is it worth indexing it then? (i have 400,000 docs in that collection)
[23:26:31] <joannac> hephaestus_rg: yes
[23:28:25] <GothAlice> hephaestus_rg: It's important to know how MongoDB uses indexes, however. MongoDB currently only uses one index to "cover" a query. Creating "compound" indexes with the fields in an appropriate order can make a world of difference. See: http://docs.mongodb.org/manual/core/index-compound/
[23:29:47] <GothAlice> Compound indexes can also save you from creating extra indexes, since MongoDB can use a compound index's prefix (first field, or first and second, or first through third, etc.) to satisfy queries using those earlier fields without needing additional indexes. But only if they're in the right order… it works from the left to right.
[23:31:15] <neo_44> hephaestus_rg: that is the rule for indexes... to be honest you don't want to query data without an index if possible
[23:31:45] <GothAlice> hephaestus_rg, neo_44: Some queries, such as string regular expressions, however, can't use indexes. It's something to be aware of. :)
[23:33:21] <neo_44> GothAlice: depends on the regex
[23:33:59] <GothAlice> Indeed. For details, see: http://docs.mongodb.org/manual/reference/operator/query/regex/#index-use
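[Editor's note: GothAlice's two points in concrete form: a compound index serves queries on its left-most prefixes, and a regex can use an index only when it is a case-sensitive anchored prefix. Collection and field names are illustrative:]

```javascript
// One compound index serves queries on its left-most prefixes...
db.posts.ensureIndex({ published_at: -1, author: 1 })
db.posts.find({ published_at: { $gte: ISODate("2015-01-01") } })  // prefix: indexed
db.posts.find({ published_at: { $gte: ISODate("2015-01-01") },
                author: "alice" })                                // full key: indexed
db.posts.find({ author: "alice" })        // not a left-most prefix: collection scan

// ...and regexes are index-assisted only when anchored and case-sensitive:
db.posts.find({ slug: /^jan-2015/ })      // prefix scan over the index
db.posts.find({ slug: /2015/i })          // cannot use the index efficiently
```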