PMXBOT Log file Viewer

#mongodb logs for Friday the 20th of February, 2015

[00:05:38] <jaitaiwan> Hey GothAlice and StephenLynx. Looks like you guys kept at it after I left haha
[03:55:21] <giowong> getting paid pretty good for not knowing anything/
[03:55:53] <cheeser> come again?
[03:56:27] <giowong> oops
[03:56:32] <giowong> anyone with mongoose experience?
[04:04:06] <Nepoxx> Anyone might have a clue why I'm getting a "Can't canonicalize query: BadValue unknown top level operator: $set" with Mongoose when trying to do an update?
[04:05:35] <cheeser> not a mongoose kinda guy but if you pastebin your code maybe I can spot something
[04:10:00] <Nepoxx> http://pastebin.com/gV63yCAd I just figured out that the 3 parameter function works fine
[04:13:15] <Nepoxx> cheeser, good luck if you're not used to Mongoose :P (but thanks!). I think this is a bug though, I'll probably file an issue on it
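For context on that error: the server raises "Can't canonicalize query: BadValue unknown top level operator: $set" when a $set document ends up in the query (conditions) position of an update rather than in the update position. A minimal sketch of that mix-up, with a hypothetical Mongoose model and fields:

    // hypothetical model/fields; $set in the first argument is parsed as a query -> BadValue
    Item.update({ $set: { count: 5 } }, function (err) { /* fails */ });

    // conditions first, update document second
    Item.update({ _id: someId }, { $set: { count: 5 } }, function (err) { /* ok */ });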
[04:15:14] <giowong__> hi
[04:15:14] <giowong__> so
[04:15:21] <giowong__> im trying to update a subdocument in a array
[04:15:30] <giowong__> im not sure if models script is correct
[04:15:32] <cheeser> Nepoxx: uh. yeah. sorry. :)
[04:15:34] <giowong__> here is the gist
[04:15:35] <giowong__> https://gist.github.com/gwong89/3f0be836ff53e5779f19
[04:15:39] <giowong__> using mongoose btw
[04:15:47] <cheeser> giowong__: try not to press enter until you finish a thought
[04:15:55] <giowong__> ok sorry
[04:16:16] <giowong__> getting a 500 error on terminal console
[04:16:42] <Nepoxx> You should look at the stack
[04:18:41] <giowong__> hm
[04:18:59] <giowong__> well i was just wondering if my mongoose logic is correct?
[04:20:25] <Nepoxx> I'm not sure I understand the "myemotions.this.timesUsed", but I am NOT a mongoose expert by any means
[04:21:03] <giowong__> so
[04:21:11] <giowong__> in my index.js
[04:21:33] <giowong__> req.body is a element of the array
[04:22:08] <giowong__> myemotions is the array name in the schema
[04:22:33] <giowong__> i wanna update the passed element that is a subdocument and update the timesUsed attribute
[04:23:07] <Nepoxx> The first thing that comes to mind is that you're overwriting the update function (update is already a function of your model)
[04:24:03] <giowong__> o shoot
[04:24:16] <giowong__> i didnt know that update was a native function in mongo
[04:24:20] <giowong__> let me change that and see if it works
[04:24:24] <Nepoxx> It's not a good idea, but I don't think it's going to be your only issue
[04:24:30] <Nepoxx> Yeah, keep me updated
[04:25:29] <Nepoxx> The other thing is that "myemotions.this.timesUsed" doesn't really make sense, myemotions is not going to be defined in your update function (or whatever you name it)
[04:25:51] <Nepoxx> it should be more like `this.myemotions.timesUsed += 1` instead
[04:26:57] <Nepoxx> Unfortunately it's very late and I'm tired, good luck with your issue giowong__
[04:27:04] <giowong__> no problem!
[04:27:06] <giowong__> thanks anyways
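For reference, the usual way to bump a counter on one subdocument of an array (rather than defining a custom update method on the schema) is an atomic $inc through the positional $ operator. A minimal sketch, assuming a hypothetical User model whose myemotions entries have an _id and a timesUsed field:

    // match the subdocument in the query, then increment it via the positional operator
    User.update(
      { _id: userId, 'myemotions._id': emotionId },
      { $inc: { 'myemotions.$.timesUsed': 1 } },
      function (err, numAffected) { /* ... */ }
    );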
[04:41:25] <MacWinner> anyone try tokumx here? any comment on the results?
[07:47:20] <iksik> hm, is it possible to sort whole collection by nested field?
[08:36:36] <coudenysj> Hi all, anyone that could help me solve a query problem?
[08:37:21] <kali> coudenysj: just state your question
[08:38:06] <coudenysj> I actually created a stack overflow question for it (http://stackoverflow.com/questions/28610445/how-do-i-get-a-list-of-mongodb-documents-that-are-referenced-inside-another-coll) so I can add data examples to it
[08:39:46] <kali> well, you're basically asking for JOIN
[08:40:06] <coudenysj> yes
[08:40:35] <kali> and you won't get them :)
[08:40:40] <kali> so.
[08:40:50] <coudenysj> i know
[08:40:52] <jaitaiwan> Aggregate?
[08:41:13] <coudenysj> the problem is that aggregate will only allow me to query one collection
[08:41:18] <kali> map/reduce and the A/F will not work because there are two collections involved
[08:41:30] <kali> well, map/reduce could, but it's cumbersome
[08:42:40] <kali> so a more practical option is denormalization : just mirror the bus/user relation in the user table, and it becomes easy
[08:43:38] <coudenysj> that is indeed an option, but I think this could get me into trouble in the future
[08:43:47] <coudenysj> writes that do not complete, etc...
[08:44:53] <coudenysj> for now I'm using the distinct+ where $in combination
[08:45:07] <jaitaiwan> Couldn't you just associate the relation in reverse? A user can have multiple businesses?
[08:45:16] <coudenysj> but like I said on StackOverflow, this will get very large in the future
[08:45:21] <kali> thing is... if you want a relational database, just use one
[08:45:40] <coudenysj> @jaitaiwan same problem occurs if I want businesses with users related
[08:46:01] <coudenysj> @kali this is the only use case I have that requires joins
[08:46:18] <jaitaiwan> https://www.youtube.com/watch?v=vqgSO8_cRio
[08:46:39] <coudenysj> the data i'm storing in the business documents would be problematic in relational dbs
[08:48:43] <jaitaiwan> @coudenysj: from the youtube video re where the business vs users relation is stored: "why not have both?"
[08:50:57] <coudenysj> like I said, I think this will get me into trouble in the future when writes fail etc.., but I will start to use it, I think :)
[08:53:33] <jaitaiwan> Probably safer to store the relation separately if you're concerned about writes
[08:53:46] <jaitaiwan> and you can force confirmation too
[09:01:03] <coudenysj> I'll have a look at the "double relation" solution, thanks @kali & @jaitaiwan
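A minimal sketch of the denormalized "double relation" discussed above, assuming hypothetical users and businesses collections: each side carries an array of ids for the other, so either direction becomes a single-collection query, at the cost of two writes per link:

    db.users.insert({ _id: "u1", name: "Alice", businesses: ["b1"] })
    db.businesses.insert({ _id: "b1", name: "Acme", users: ["u1"] })

    // "users of business b1" without touching the businesses collection
    db.users.find({ businesses: "b1" })

    // keep both sides in sync on every link (two separate writes, so handle partial failures)
    db.users.update({ _id: "u1" }, { $addToSet: { businesses: "b2" } })
    db.businesses.update({ _id: "b2" }, { $addToSet: { users: "u1" } })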
[10:21:19] <cers> is there a way to control the format of Date objects in mongoexport? I need it to be a unix timestamp instead of something like 2014-12-04T13:23:55.000Z
[11:18:03] <ra21vi> how can I search all document containing exact array items. For ex searching for A, B would result this doc { participant: [A, B] }, since it exactly have both item in field
[11:35:40] <arvydas_> hello, if i drop a collection, why is the sharding not removed?
[12:10:14] <arvydas_> how to drop sharded collection together with sharding? if i drop a collection and add collection with same name - sharding reappears
[12:12:03] <coudenysj> ra21vi: http://stackoverflow.com/a/6165143/213624
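One common way to express ra21vi's "exact array items" query (not necessarily the form the linked answer uses) is to combine $all with $size, so only arrays containing exactly those members match:

    // matches { participant: ["A", "B"] } and { participant: ["B", "A"] }, but not supersets
    db.collection.find({ participant: { $all: ["A", "B"], $size: 2 } })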
[12:43:06] <arvydas_> after dropping collection, db.collection.stats() returns: "sharded" : true....
[13:28:24] <no-thing_> how to drop a collection with its config information? (drop totally, so no information about sharding remains) ?
[13:29:02] <flok420> i have this data: http://pastebin.com/XnNaQ75W structure. how do I retrieve the ts/value pairs from all documents with "data_source" == "meminfo" and ts > 123 ?
[13:31:12] <coudenysj> flok420: use the aggregation framework
[13:34:26] <flok420> coudenysj: I tried that but that give an error I don't understand: http://pastebin.com/8iUR57AX
[13:35:20] <coudenysj> flok420: try to $unwind the param_data first
[13:38:11] <flok420> I added { $unwind : "$param_data" }, but it gives the same error (gt requiring 2 parameters)
[13:42:55] <joannac> if : {$gt: ["$ts" , 0] }
[13:43:16] <no-thing_> how to remove any sharding information on collection that was dropped?
[13:43:33] <joannac> just like the docs http://docs.mongodb.org/manual/reference/operator/aggregation/redact/#exclude-all-fields-at-a-given-level
[13:43:56] <joannac> no-thing_: ? just drop it?
[13:44:06] <no-thing_> i dropped it
[13:44:12] <no-thing_> but
[13:44:18] <cheeser> wrote my first few aggregations "in anger" yesterday :)
[13:44:41] <no-thing_> collection.stats returns sharding : true
[13:44:43] <joannac> cheeser: slacker
[13:44:53] <no-thing_> and if i create collection with same name
[13:44:59] <no-thing_> the sharding is back
[13:45:01] <no-thing_> and functioning
[13:46:22] <no-thing_> tried to flushconfig on meta server, restarting , etc
[13:49:24] <flok420> joannac / coudenysj: works! thanks!
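For the archives, a sketch of the pipeline shape that resolves flok420's question, using a plain $match after the $unwind as an alternative to the $redact/$cond form joannac linked; the collection and field names are guesses from the question:

    db.metrics.aggregate([
      { $match: { data_source: "meminfo" } },       // select the documents of interest
      { $unwind: "$param_data" },                   // one document per ts/value pair
      { $match: { "param_data.ts": { $gt: 123 } } } // keep only the newer samples
    ])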
[13:50:27] <joannac> no-thing_: um, i can't reproduce this
[13:50:48] <joannac> dropping the collection and recreating it doesn't magically make it sharded again for me
[13:50:59] <no-thing_> http://pastebin.com/zk6wXuzE
[13:51:06] <no-thing_> thats why i can't wrap my head around
[13:51:20] <no-thing_> after dropping collection
[13:51:26] <no-thing_> this is the output of .stats
[13:51:36] <joannac> what version?
[13:51:40] <no-thing_> 2.6.7
[13:52:40] <joannac> that's really weird
[13:53:32] <joannac> have you checked all your shards to make sure the collection is not there?
[13:53:53] <no-thing_> yes
[13:54:33] <no-thing_> tried restarting everything
[13:57:44] <joannac> what does config.collections say?
[13:59:26] <joannac> does it show up in sh.status()?
[14:01:02] <StephenLynx> cheeser what do you mean by "in anger"?
[14:02:56] <cheeser> StephenLynx: using it for an actual use case and not just recreating an example aggregation.
[14:03:17] <cheeser> e.g., i've written a few in morphia working out that API but they were simply recreations of ones that asya had written.
[14:04:27] <StephenLynx> I never avoided aggregations.
[14:04:54] <StephenLynx> It's much easier to do sorting, limiting and skipping with them rather than using methods.
[14:05:00] <StephenLynx> at least in node/io
[14:05:00] <cheeser> i haven't avoided them. just hadn't needed them.
[14:05:27] <cheeser> though i've been meaning to add a new module to my irc bot for creating channel stats using aggregation
[14:05:36] <cheeser> just Too Much to Do
[14:17:15] <flok420> is it possible to suppress _id in the output of an aggregate call?
[14:18:41] <StephenLynx> yes.
[14:18:48] <StephenLynx> just put _id:0 on the projection block.
[14:21:06] <no-thing_> thanks joannac mentioning config db, just cleaned from there and now everything works fine
[14:28:47] <flok420> ok I've added { $project : { '_id' : 0, param_data : 1 } } and that indeed gives the output I wanted, e.g. only ts/value. unfortunately this doesn't solve the "spec must be an instance of dict" error I get from Python
[14:38:06] <hashpuppy> can you have a replica set where one node is 64-bit and the other 32-bit?
[14:39:00] <cheeser> i wouldn't advise it
[14:39:53] <cheeser> http://docs.mongodb.org/manual/faq/fundamentals/#what-are-the-limitations-of-32-bit-versions-of-mongodb
[14:40:27] <mattyw> hey folks, quick question. If I set an Index with DropDups: true - I can't see anything that says when the duplicate is dropped, it looks like it will happen "at some point" so it's possible for duplicates to exist in the collection for a small amount of time?
[14:41:04] <cheeser> they'll be gone when the index is done building.
[14:42:24] <hashpuppy> thanks, cheeser
[14:43:23] <mattyw> cheeser, isn't the index only built once - so if I create the index with dropdups I can then insert duplicates?
[14:43:44] <mattyw> cheeser, I'm just trying to work out the exact differences between dropdups and unique
[14:44:08] <mattyw> cheeser, I did read the docs - but it didn't seem to spell it out as clearly as I'd hoped
[14:47:00] <cheeser> insertions will fail with a constraint violation
[14:47:13] <cheeser> dropdups only makes sense at creation time.
[14:47:33] <cheeser> if you have existing data and now you want to make a field unique, what do you do to duplicates?
[14:48:01] <cheeser> one option, the default, is to fail the index build. the other option is to delete the dupes when you're building the index.
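A sketch of the 2.6-era index builds cheeser is describing (the field name is hypothetical, and dropDups was removed in MongoDB 3.0):

    // default: the build fails with a duplicate key error if duplicates already exist
    db.users.ensureIndex({ email: 1 }, { unique: true })

    // dropDups: true deletes all but one of each duplicate while the index is built;
    // afterwards, inserting a duplicate fails with a constraint violation either way
    db.users.ensureIndex({ email: 1 }, { unique: true, dropDups: true })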
[15:05:21] <flok420> I converted this http://pastebin.com/scD481Xs into this http://pastebin.com/jmCQXxFx The original works fine in the mongo commandline client, but the latter returns zero records. What could be the cause?
[15:27:33] <GothAlice> So, marrow.task is back on the table. SERVER-15815 be damned. The wait timeouts just snap to the next largest multiple of ~2.5 seconds instead of being precise. Ah well. ^_^
[15:27:55] <GothAlice> *evar
[15:28:24] <Zelest> evah!
[15:28:35] <Zelest> fo evah n' evah!
[15:29:52] <freeone3000> I'm having a problem bringing secondaries in a repl set back up after a changeover. My error is "[repl prefetch worker] uh oh: 90". I can't seem to find documentation on this. Any suggestions?
[15:35:03] <saml> http://stackoverflow.com/questions/25544528/migration-errors-in-cluster
[15:36:24] <saml> https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/sync_tail.cpp#L111
[15:37:18] <saml> https://github.com/mongodb/mongo/blob/master/src/mongo/db/index/index_descriptor.cpp#L89
[15:37:49] <saml> freeone3000, that means _magic is not 123987
[15:37:54] <saml> your magic is gone. sorry
[15:38:08] <GothAlice> Big bada boom.
[15:38:21] <saml> where is _magic defined?
[15:38:56] <GothAlice> Likely in the C struct header for the on-disk index data structures.
[15:39:06] <saml> https://github.com/mongodb/mongo/search?utf8=%E2%9C%93&q=_magic
[15:39:18] <saml> so sometimes magic is not 123987
[15:40:08] <GothAlice> Luckily, with a replica set, fixing corrupted on-disk structures is as simple as nuking the dead node and recreating it.
[15:40:15] <GothAlice> Let replication sort 'em out.
[15:41:28] <saml> freeone3000, what's changeover that you did?
[15:43:28] <freeone3000> saml: We had a server fail due to lack of disk, which since we deploy all the servers the same, caused it to replicate to every other server, filling up their disks, and causing them to fail due to lack of disk.
[15:43:58] <freeone3000> saml: So we've been migrating to much larger storage drive. (20TB instead of 1TB)
[15:44:28] <GothAlice> Uhm. That's bad news bears. Running out of disk == corrupt data, often.
[15:44:59] <GothAlice> Hopefully an index is the only casualty, but you'll need to run a --repair across it for sure.
[15:45:00] <nobody18188181> in mongodb 2.6, how do I get the lock percentage?
[15:45:43] <freeone3000> GothAlice: Yeah, did that. In the middle of that, I'm getting the "uh oh 90".
[15:45:54] <saml> using 2.6 version?
[15:47:11] <saml> maybe open a ticket. scary situation to be in
[15:47:34] <GothAlice> freeone3000: Well, that node is borked. Next is to hope that at least one isn't, and you can re-replicate off that one.
[15:47:37] <nobody18188181> in mongo 2.6; db.serverStatus()["globalLock"] ratio is gone. Where do I get this information now?
[15:47:45] <freeone3000> GothAlice: Okay, so just wipe the data and have it continue with recovery?
[15:47:53] <freeone3000> GothAlice: Err, rather, have it recover by replicating?
[15:47:59] <GothAlice> freeone3000: Yes.
[15:48:21] <saml> how do you wipe out data? rm data-dir ?
[15:48:34] <saml> /dev/mapper/mongogrp-datavol 197G 95G 92G 51% /var/lib/mongo
[15:48:37] <saml> i think we're okay
[15:48:45] <StephenLynx> lol backblaze does not support linux GothAlice
[15:48:51] <StephenLynx> how do you get around that?
[15:48:53] <GothAlice> There's usually no benefit to bashing against a failing replica secondary, just nuke it and have it re-replicate. (Certain datasets this is ill-advised for, i.e. ones with massive write loads, the node may never catch up, or where you have extremely large amounts of data.)
[15:49:05] <GothAlice> StephenLynx: Mount my zfs volume snapshot on my Mac. ;)
[15:49:12] <freeone3000> See above, 1TB data space exceeded.
[15:49:25] <StephenLynx> ugh
[15:49:25] <GothAlice> freeone3000: I have 24 TiB. 1 TiB is tasty small. ;)
[15:49:29] <freeone3000> Ah, okay.
[15:49:41] <StephenLynx> yeah, nah. I guess I won't be going anywhere near backblaze
[15:49:43] <saml> i got 95GB
[15:49:50] <GothAlice> Er, 26TiB. 24 a few months ago. Old habits. ^_^
[15:49:59] <saml> that's big data
[15:50:13] <saml> what do you store?
[15:50:16] <GothAlice> Everything.
[15:50:22] <saml> every single http request?
[15:50:27] <GothAlice> Every TCP packet.
[15:50:32] <GothAlice> A copy of Wikipedia.
[15:50:41] <saml> is this for personal use?
[15:50:42] <StephenLynx> she hoards data.
[15:50:48] <saml> or NSA?
[15:50:52] <GothAlice> 350,000+ books, 170,000+ songs, …
[15:51:05] <saml> oh are you mega.com ?
[15:51:11] <GothAlice> Heh. i have no need for mega. :P
[15:51:30] <GothAlice> It's my Exocortex. It does NLP and AI researchy stuffs across every bit of digital information I have touched since 2001.
[15:51:34] <saml> so you have 24TB harddrive in your basement
[15:51:41] <GothAlice> Well, earnest thing is Sept. 2000, but.
[15:51:43] <saml> wow
[15:51:59] <GothAlice> saml: Three Drobo 8-something-i rack-mount arrays + 3 1U Dell boxen.
[15:52:20] <GothAlice> *earliest
[15:52:34] <saml> mongodb wasn't around in 2001
[15:52:39] <GothAlice> No it wasn't. :)
[15:52:43] <saml> so i guess you migrated
[15:52:50] <saml> that's amazing hobby
[15:53:06] <saml> or job
[15:53:08] <GothAlice> I also added full text indexing and boolean filtered Okapi BM-25 ranking, plus field-level compression to MongoDB 6 years ago.
[15:53:13] <GothAlice> ¬_¬
[15:53:42] <saml> i have no idea what those are. but probably you get high electricity bill
[15:53:47] <saml> unless you have generator as well
[15:53:48] <GothAlice> saml: Free. :)
[15:53:59] <StephenLynx> canada. NA wonderland.
[15:54:19] <GothAlice> Well, no, mostly it's due to a rental contract that was badly written on the part of the rental agency. ;)
[15:54:46] <GothAlice> Electricity included, but not added to cost of rent = I can exploit that. >:3
[15:55:11] <StephenLynx> I used to value data. then after getting pissed off about having to care about it I just said "fuck it" and I just backup my browser bookmarks and RSA key.
[15:55:12] <saml> probably you have one of the biggest mongodb installation in the world
[15:55:24] <GothAlice> saml: Ha, no.
[15:55:33] <saml> i thought my 95GB was big and hard to manage
[15:55:40] <StephenLynx> 26tb is a lot for a person, but is kind of small when you get to industrial levels.
[15:56:18] <GothAlice> (And remember, with xz/lzma compression, text like HTML pages compress to < 10% their original size… uncompressed my dataset is a *lot* larger!)
[15:57:11] <saml> so you have no index? just key-val where value is compressed html files?
[15:57:25] <saml> or did you publish some interesting reports, aggregation, analysis?
[15:59:10] <GothAlice> saml: Nah. There's the GridFS blob storage, compressed. There's the de-duplicated full text index, i.e. storing unique combinations of root words. There's the double and triple word association pairs. The tag neural network for predictive association and the tag synonym lists. NLP for tag prediction. And a metadata collection to store extracted data like EXIF, ID3, etc. It's non-hierarchical, so most searching is done by tag.
[15:59:55] <GothAlice> Plus processing pipelines to do content extraction from different media types, perform computer vision for facial recognition in photos/videos, etc., etc., etc.
[16:00:42] <StephenLynx> whoa
[16:01:00] <saml> okay, i nominate you to webscale expert of the month
[16:01:11] <StephenLynx> I wonder if that many people were the same person.
[16:01:18] <saml> if you blog, you'll get followers and make money
[16:01:20] <GothAlice> I can ask it "what's the first word you think of when I say 'purple'" and it'll likely answer "eggplant". ;P
[16:01:55] <GothAlice> saml: :P
[16:01:59] <StephenLynx> GothAlice is that system's code disclosed?
[16:02:11] <GothAlice> It's patent encumbered by about 115 patents… so no, not open. T_T
[16:02:22] <StephenLynx> you patented it?
[16:02:44] <saml> what's your mongo configuration like?
[16:02:45] <GothAlice> No, without realizing it several of the behaviours I was implementing (NLP for tag prediction being one) were already encumbered.
[16:02:52] <saml> just a replicaset? no sharding? mongos?
[16:03:24] <GothAlice> Two different process pools, one for blob storage, the other for indexed metadata querying. (Needed to isolate memory growth of the blob storage processes.)
[16:03:33] <GothAlice> And yeah, pure replicaset.
[16:03:55] <StephenLynx> wait, so people patented algorithms and you can't implement the concept itself?
[16:04:09] <GothAlice> You can implement to your hearts content. You just can't share without paying up.
[16:04:20] <StephenLynx> that is straight up bullshit.
[16:05:55] <GothAlice> https://github.com/marrow/contentment/blob/develop/web/extras/contentment/components/search/model.py#L37-L112 is my original search ranking algorithm, BTW. (Okapi BM-25, the same that Yahoo! and Lucene use.) https://github.com/marrow/contentment/blob/develop/web/extras/contentment/components/asset/model/index.py is the index storage model.
[16:06:24] <GothAlice> https://en.wikipedia.org/wiki/Okapi_BM25 < describes the algorithm
[16:07:51] <GothAlice> Technically covered by https://encrypted.google.com/patents/US7925644
[16:08:41] <GothAlice> Owner name: MICROSOFT CORPORATION, WASHINGTON — damn you, Microsoft!
[16:09:04] <StephenLynx> I might try and release implementations out of spite
[16:15:07] <mrmccrac> i know mongo is supposed to use all the possible ram it kind on the server, but has anyone seen it invoke oom-killer after using up all possible memory?
[16:15:10] <mrmccrac> it can*
[16:15:28] <mrmccrac> or rather oom-killer invoked killing it
[16:15:34] <GothAlice> mrmccrac: Generally, no. MongoDB uses memory mapped files, so the kernel is given free leeway to load and unload chunks as it wishes.
[16:15:46] <mrmccrac> GothAlice: this is with 3.0rc8 wiredtiger
[16:16:05] <GothAlice> However, the oom-killer under situations where _other_ processes are using too much RAM, might prioritize killing the process with the largest allocated pool.
[16:16:46] <GothAlice> Well, I can't really speak to wiredtiger behaviour. :/
[16:17:05] <mrmccrac> let me see if it has its own set of memory options..
[16:17:15] <mrmccrac> i was doing a very high amount of queries/writes
[16:17:46] <GothAlice> Chunks of memory should still only be ephemerally locked from swapping out, so I still can't really believe MongoDB itself would be triggering oom-killer.
[16:19:17] <mrmccrac> its like its keeping every doc i query about/write in memory and never releasing them
[16:19:41] <mrmccrac> ya its definitely the shard processes eating gigs and gigs of mem
[16:20:09] <GothAlice> mrmccrac: Well, the best I can recommend is to grab a spindump/trace while it's churning away and filing that as a private ticket (along with a description of the problem) on JIRA. When in doubt: ask the developers. :)
[16:20:23] <GothAlice> mrmccrac: RSS or VSZ?
[16:20:43] <mrmccrac> RES
[16:21:00] <nobody18188181> is it possible to get the runtime configuration options from mongo shell?
[16:21:38] <mrmccrac> so actual physical memory
[16:22:07] <GothAlice> mrmccrac: My kingdom for BSD's "ps" command… it has a much more descriptive breakdown of memory usage.
[16:36:50] <mrmccrac> storage.wiredTiger.engineConfig.cacheSizeGB
[16:36:53] <mrmccrac> will look at tuning that..
[16:37:00] <mrmccrac> although by default it should only use half of physical ram
[16:48:03] <jrbaldwin> can anyone help with this error i'm having with mongo/mongoose http://stackoverflow.com/questions/28621931/mongo-traverse-error-when-updating-object-inside-nested-array
[16:48:19] <jrbaldwin> everyone says the query is right but i'm still having that traversal error
[16:49:17] <StephenLynx> try without mongoose.
[16:57:46] <mattd_> hey guys, quick ques if anyone has a sec: im just looking to maintain a relationship between 2 documents, so i have a ref object id in each, pointing to the other. there's no transaction support in mongo, so what would be the best way to ensure i can update each document with the reference to the other?
[16:59:17] <mattd_> following the suggestions here: http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/ my best bet?
[16:59:48] <GothAlice> mattd_: http://www.javaworld.com/article/2088406/enterprise-java/how-to-screw-up-your-mongodb-schema-design.html
[17:00:10] <mattd_> GothAlice: haha thanks ill take a look
[17:01:00] <bros> Anybody here?
[17:03:38] <StephenLynx> aye
[17:04:56] <mattd_> GothAlice: im trying to link 2 root documents to each other, nesting them within each other isnt an option.. looking to do a many to many
[17:05:40] <GothAlice> mattd_: Looks like two-phase is your solution, then.
[17:06:07] <GothAlice> mattd_: As a note, if you're storing graphs, it's generally better to use a real graph database. (Right tool for the right job.)
[17:06:20] <mattd_> GothAlice: yea I hear you, might be headed that way
[17:06:25] <mattd_> thanks for the help
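A minimal sketch of the mutual-reference (many-to-many) shape mattd_ describes, with hypothetical collections; the two updates are separate writes, which is exactly the gap the two-phase-commit tutorial linked above addresses:

    // store the relation on both sides so either document can be queried directly
    db.authors.update({ _id: authorId }, { $addToSet: { bookIds: bookId } })
    db.books.update({ _id: bookId }, { $addToSet: { authorIds: authorId } })

    // "all books for an author" then becomes a single-collection lookup
    db.books.find({ _id: { $in: db.authors.findOne({ _id: authorId }).bookIds } })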
[17:09:16] <nobody18188181> is it possible to get runtime configuration settings? like in mysql how you can get current settings?
[19:02:48] <MacWinner> any suggestions on CMS's in php that are driven by mongodb?
[19:03:02] <MacWinner> i'm standardizing everything on mongo
[19:07:12] <StephenLynx> I just use the driver on io.js.
[19:07:39] <StephenLynx> oh a CMS is another thing.
[19:07:46] <StephenLynx> like, wordpress?
[19:08:24] <MacWinner> StephenLynx, yeah.. wordpress. or drupal.. actually something that may be commerce focused would be cool..
[19:08:39] <StephenLynx> now that would be a very narrow search, if you ask me.
[19:08:43] <MacWinner> i just need a portal system that can have users register .. pay for services etc
[19:09:07] <StephenLynx> not only do you want something that uses mongo, but also something that uses PHP and that you just have to deploy.
[19:09:24] <StephenLynx> there is a CMS that uses mongo, but it is built with node, if I'm not mistaken
[19:09:36] <MacWinner> calipso?
[19:09:48] <MacWinner> i'm open to node as well.. planning on moving there in the future
[19:12:35] <StephenLynx> it outperforms PHP by miles and you are starting something new.
[19:13:01] <StephenLynx> http://keystonejs.com/ theres this one too
[19:13:42] <StephenLynx> been working with it for a while now, just moved on to io.js. work wonders with mongo.
[19:14:02] <MacWinner> you switched from keystone to io.js?
[19:15:03] <MacWinner> oh.. io.js is a node replacement
[19:15:09] <cheeser> fork
[19:15:12] <cheeser> but yeah
[19:20:47] <GothAlice> MacWinner: I'm working on a cMS at work. It's Python, though.
[19:21:18] <GothAlice> cMS having a lower c intentionally: it's a component management system. Acting like a CMS (content) is just one of the things it does.
[19:55:00] <kirby> so I'm trying to access server data from a javascript program - connected to the server fine - but none of the .js commands mongo has in their documentation are recognized as being methods in the db class
[19:55:02] <kirby> any ideas?
[20:51:18] <MacWinner> i see some solutions that use capped collections and tailable cursors for pubsub type of functionality. Is there a way to do pubsub where only one subscriber gets a message at a time? If i have 4 servers which have subscribers listening, I only want 1 of them to process at a time.. am I going to need the subscriber to somehow lock the document?
[20:51:28] <MacWinner> any pointers to example implementation of this?
[20:52:27] <cheeser> use findAndModify to grab a document out of the collection and mark it as "in process" or something.
[20:52:39] <cheeser> each accessing thread would get a different document then
[20:54:08] <MacWinner> cheeser, thanks!
[20:56:25] <MacWinner> cheeser, after the task is complete, is it the task's responsibility to delete the document?
[20:57:20] <cheeser> yep
[20:57:31] <cheeser> or a janitor thread
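A sketch of the claim-and-delete cycle cheeser describes, with a hypothetical queue collection and status field:

    // atomically claim one pending document so no other worker picks it up
    var job = db.queue.findAndModify({
      query:  { status: "pending" },
      sort:   { _id: 1 },
      update: { $set: { status: "processing", claimedAt: new Date() } },
      new:    true                       // return the post-update document
    })
    // ...do the work, then remove it (or leave it for a janitor pass)
    if (job) db.queue.remove({ _id: job._id })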
[20:57:35] <GothAlice> MacWinner: You could add a random integer field to each message in the capped collection. Each subscriber can then use a query on the tailing cursor which selects for that field modulo the number of works == that worker's ID.
[20:57:48] <GothAlice> s/works/workers
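A sketch of that partitioning scheme in the shell, with hypothetical worker numbering (the collection must be capped for the tailable option to apply):

    var numWorkers = 4, workerId = 2;   // hypothetical values
    // each message carries a random integer; worker N consumes only documents whose
    // integer modulo numWorkers equals N
    var cursor = db.messages.find({ shard: { $mod: [numWorkers, workerId] } })
                            .addOption(DBQuery.Option.tailable)
                            .addOption(DBQuery.Option.awaitData);
    while (cursor.hasNext()) { printjson(cursor.next()); }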
[20:58:48] <GothAlice> MacWinner: It actually sounds like you're working on something very similar to what I'm doing right now: https://gist.github.com/amcgregor/4207375
[20:59:40] <GothAlice> cheeser: findAndModify is dubious. The docs state you get the record prior to any modifications you define, and my driver warns me that it's deprecated. :/
[21:00:09] <GothAlice> MacWinner: I also use time-to-live indexes to have MongoDB automatically prune old data.
[21:00:28] <GothAlice> (So no need for manual clean-up, nor extra threads, etc.)
[21:00:41] <MacWinner> GothAlice, i see.. but if one of the workers is dead or busy processing, then the next thing won't get processed?
[21:00:43] <cheeser> GothAlice: it works. we use it here and i've used it past gigs.
[21:01:06] <fewknow> BOOM
[21:01:45] <GothAlice> MacWinner: My use case (and example link above) has each subscriber with a thread pool which queues up jobs to work on. Also, if you don't have process monitoring that can restart dead workers, you have more problems than having jobs skipped.
[21:02:02] <GothAlice> s/more/bigger
[21:02:08] <fewknow> MacWinner: why can't you use an actual Message service rather than hacking Mongo?
[21:02:57] <GothAlice> fewknow: "hacking Mongo" — the code I linked can process 1.9 million distributed RPC calls per second (benchmarked with two producers, five consumers). MongoDB uses capped collections internally to handle replication. So yeah. Less of a hack.
[21:03:09] <MacWinner> fewknow, i was trying to standardize on mongo as much as possible.. i have redis setup which has some nice pubsub stuff as well
[21:04:19] <GothAlice> MacWinner: If you include worker start/stop messages on that message bus, then they can even automatically reconfigure themselves during runtime.
[21:04:24] <MacWinner> GothAlice, isn't findAndModify atomic?
[21:04:27] <fewknow> GothAlice: it is a hack...you are turning a database into a messaging service. Why reinvent the wheel? Could you not use a queue/message service to get the same results?
[21:04:30] <Chipper351> I am looking for someone I can talk with that has experience with setting up MongoDB over Multi-Sites and Multiple Locations (All Active). I am having trouble putting together a picture of how this would work and would really appreciate if someone would be able to talk for a few minutes.
[21:04:40] <GothAlice> fewknow: It's already a messaging service.
[21:05:29] <GothAlice> fewknow: http://docs.mongodb.org/manual/tutorial/create-tailable-cursor/ for the API, http://docs.mongodb.org/manual/core/replica-set-oplog/ for MongoDB's own use of it.
[21:05:46] <medmr> thats a little like saying android is a web browser... i mean tools can be overkill/too generic for a job at hand
[21:06:03] <fewknow> a tailable cursor doesn't make a messaging service
[21:06:53] <GothAlice> That's pretty much the only capability you need to implement one, so I don't grok the distinction you are trying to make fewknow.
[21:07:50] <GothAlice> The fact that tailable cursors are standard find queries makes it even better.
[21:08:17] <fewknow> I would argue a full messaging service has management of failure, time-delayed messages, and other functionality of a complete service. Using a database with a tailable cursor is not a service.
[21:09:36] <fewknow> if all you need is to tail a file then you could use tail -f on a text file and just write to it
[21:10:13] <MacWinner> if all you need is tail -f, then just write to mongo and use tailable cursor :)
[21:10:21] <fewknow> I wasn't saying you can't use mongo for a PUB/SUB...i just think there are better built tools to use already
[21:10:26] <MacWinner> then you can tail -f across servers!
[21:10:39] <GothAlice> Management of failure (i.e. watchdog keeping workers alive) is outside the scope of the message bus itself. Time delay is pretty easily implementable. (The RPC system I linked allows scheduling tasks at arbitrary times.)
[21:11:32] <MacWinner> GothAlice, what was the issue with findAndModify you mentioned?
[21:11:38] <GothAlice> "Better built" is highly subjective. And just because something exists doesn't mean it's automatically a better solution. For light-weight messaging, one doesn't need third-party tools at all.
[21:11:53] <GothAlice> MacWinner: The docs themselves state that you get the original, unmodified document back. That limits the usefulness.
[21:12:07] <bybb> Hey all!
[21:12:10] <GothAlice> (Esp. if you have a naive ORM/ODM that re-saves everything on each save. That would nuke your prior change.)
[21:12:55] <MacWinner> cool, thanks
[21:13:03] <bybb> I have weird results with this query db.orders.find({}, {state: {$slice: -1 }})
[21:13:17] <GothAlice> https://github.com/marrow/task/blob/develop/marrow/task/message.py is an example of a message bus model, with message schemas that make sense for RPC, as an example.
[21:13:46] <bybb> Could you tell me what it is supposed to do? Because it doesn't do what I thought.
[21:13:59] <GothAlice> bybb: http://docs.mongodb.org/v2.6/reference/operator/projection/slice/
[21:14:13] <GothAlice> Negative values indicate "first N values" should be returned.
[21:14:30] <bybb> GothAlice yeah I know
[21:14:32] <GothAlice> Er, I mean last.
[21:14:42] <bybb> Well it's a projection
[21:15:02] <GothAlice> Is state an array?
[21:15:08] <GothAlice> If no, then that query won't do much.
[21:15:09] <bybb> And I get all the document fields but just the last item of state
[21:15:18] <GothAlice> Ah, yes.
[21:15:29] <bybb> I would like to just get the last item of state array
[21:15:55] <bybb> I don't know if you get it when I write it like this
[21:16:14] <GothAlice> bybb: Don't paste code into the channel. Gist/pastebin it.
[21:16:38] <bybb> It's just one line, is it bad?
[21:17:29] <GothAlice> bybb: I thought you were preparing to paste more.
[21:18:25] <bybb> no really, it's just that line of code
[21:19:06] <bybb> Well it would be more like this http://pastebin.com/Ad2THump
[21:19:27] <bybb> So I don't get the behavior
[21:19:56] <bybb> Shouldn't I get just the last array's item and nothing else?
[21:20:02] <MacWinner> cool.. thanks for all the tips! very appreciated
[21:21:31] <MacWinner> fewknow, stumbled on this article: https://blog.serverdensity.com/replacing-rabbitmq-with-mongodb/
[21:22:15] <bros> An item has one or many barcodes. How can I model this with Mongo? I have barcode_id and item_id.
[21:22:23] <bybb> Well apparently it's the normal behavior https://www.safaribooksonline.com/library/view/mongodb-the-definitive/9781449344795/ch04.html
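For the record, the behavior bybb ran into: a projection containing only a $slice still returns every other field. Narrowing the output means combining the $slice with explicit inclusions (field names below are hypothetical):

    // returns all fields, with state trimmed to its last element (what bybb observed)
    db.orders.find({}, { state: { $slice: -1 } })

    // combine $slice with explicit inclusions to get only what you need
    db.orders.find({}, { _id: 0, customer: 1, state: { $slice: -1 } })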
[21:23:08] <GothAlice> MacWinner, fewknow: The presentation I gave on using MongoDB as dRPC, that code replaced RabbitMQ and ZeroMQ. Additionally, on that project, MongoDB replaced Memcache/Membase (using TTL indexes) and Postgres.
[21:23:23] <GothAlice> When I joined the project, yes, they really were using all of that—and more.
[21:23:48] <MacWinner> yeah.. i just need to simplify my life.. we have a small team and don't want to manage too many components
[21:24:28] <MacWinner> our load is not high and grows at a controlled pace
[21:24:31] <GothAlice> MacWinner: Just remember to give yourself room to grow (and have monitoring!) on the size of the capped collection.
[21:24:55] <GothAlice> For bachelor we pre-allocated an 8GiB capped collection to handle a theoretical one million simultaneous users playing the game.
[21:25:27] <GothAlice> (With a requisite one week history being preserved.)
[21:30:29] <GothAlice> MacWinner: The SD blog post's final point about scaling is avoided in marrow.task because I store the (locking) task data in a normal collection already, and use the capped collection purely for messaging.
[21:31:00] <MacWinner> GothAlice, i see.. cool.. thanks!
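Pre-allocating a capped collection of a given size, as described above, is a one-liner; a sketch (8 GiB, with a hypothetical collection name):

    // size is in bytes; once full, the oldest documents are overwritten first
    db.createCollection("messages", { capped: true, size: 8 * 1024 * 1024 * 1024 })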
[21:54:44] <blizzow1> Anyone here have experience integrating solr/elasticsearch/sphinx with mongo? Everything I see seems to mention mongo-connector and all the articles about that seem to say it's replicating information. I don't really want to duplicate 1.3TB of mongoDB data just to search quickly. Am I misinterpreting the scope of mongo-connector's replication?
[22:53:33] <jrbaldwin> anyone know where to start looking to solve this issue? it feels like a write collision but i'm not sure:
[22:53:34] <jrbaldwin> http://stackoverflow.com/questions/28639492/node-mongoose-res-send-erases-previous-mongo-update