PMXBOT Log file Viewer


#mongodb logs for Thursday the 28th of January, 2016

[00:11:05] <Freman> a backlog has started
[00:11:06] <Freman> http://pastebin.com/Dk98yiUu
[00:11:13] <Freman> ^ iotop
[00:17:41] <Freman> total/actual disk reads are showing 30+ mbit/second on both machines... the writes are sweet fa... occasional spikes to 5 mbit
[00:18:15] <Freman> why's bulk inserting causing so much reading?
[00:32:20] <Freman> yep still going on
[00:36:45] <Freman> db.currentOp()...
[00:36:47] <Freman> hahahahahahahahahaha
[01:29:10] <Freman> yay!
[01:29:46] <Freman> I've convinced the dev that caused this morning's issues to stop querying for unindexed fields on 217609296400 byte collections that don't even exist
[01:33:36] <cheeser> heh
[01:38:47] <GothAlice> How can they have size if they don't exist?
[01:47:58] <Freman> the collection exists... it just doesn't have the array field "ids", log_info does...
[01:48:10] <Freman> and as an added bonus log_info even has an index on it
[01:49:07] <cheeser> several problems in there
[01:50:23] <HarryKuntz> what is mongoose?
[01:50:40] <cheeser> http://mongoosejs.com/
[01:51:08] <HarryKuntz> i read that but what is the point over vanilla mongo?
[01:51:28] <cheeser> i'm not a js guy so i really couldn't say
[02:04:50] <HarryKuntz> cheeser: what are you?
[02:16:34] <dman777> HarryKuntz: I don't really see an advantage w/ mongoose over vanilla mongo.
[02:17:06] <dman777> HarryKuntz: I've spent some time with mongoose and I think I prefer vanilla mongo. but I haven't spent enough time yet with vanilla mongo to say for sure.
[03:26:26] <SnarfSnarf> Hey all! I'm trying to build a room rental app as a class project and am having trouble thinking through the schema. I have a room schema that contains the location, an id and the name of the room. I'm having trouble thinking about how to integrate the schedule part. Any suggestions?
[05:43:59] <dman777> SnarfSnarfSnarf: if it's in the same context of the room, keep it as a single embedded document
[05:44:36] <SnarfSnarfSnarf> dman777: Wouldn't it quickly get out of hand to have it include every time the room is reserved? Especially considering reservations up to a week in advance
[05:45:20] <dman777> SnarfSnarfSnarf: not if every room has its own schedule
[05:45:31] <dman777> and keep it yearly
[05:45:55] <SnarfSnarfSnarf> So I'd have name, location, (all the dates where it's taken?)
[05:46:02] <SnarfSnarfSnarf> and then append to that?
[05:46:04] <dman777> SnarfSnarfSnarf: but really, it sounds more like a relational database might be more appropriate in my opinion.
[05:46:43] <dman777> with small bits of fragmented data out of context.... rooms and schedules
[05:47:10] <dman777> but that's just in my view....others might have a better contrasting opinion
[05:47:29] <dman777> the whole point of mongo is to have as few joins as possible, so you want data in the same context in single documents
[05:51:38] <SnarfSnarfSnarf> dman777: The idea behind using Mongo was to, in the future, be able to change it to have general locations around say a city or state, with geo data. I figured Mongo's find would be really useful in that regard for say searching for rooms 8 km away that are open; I'm just really stumped on the open part
[05:52:48] <dman777> it sounds like a lot of relational data to me
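One way dman777's embedded-schedule suggestion could be sketched (all field names here are invented for illustration, not a prescribed schema): the room document carries its own reservation intervals, and availability is an overlap check against that embedded array.

```javascript
// Illustrative room document with its schedule embedded.
// Field names are made up for this sketch.
const room = {
  _id: "room-101",
  name: "Conference Room A",
  location: { building: "Main", floor: 1 },
  reservations: [
    { start: new Date("2016-01-28T09:00:00Z"), end: new Date("2016-01-28T10:00:00Z") }
  ]
};

// A requested interval is free if it overlaps no existing reservation.
function isFree(room, start, end) {
  return room.reservations.every(r => end <= r.start || start >= r.end);
}

console.log(isFree(room, new Date("2016-01-28T10:00:00Z"), new Date("2016-01-28T11:00:00Z"))); // → true
console.log(isFree(room, new Date("2016-01-28T09:30:00Z"), new Date("2016-01-28T10:30:00Z"))); // → false
```

Whether this beats two relational tables depends, as discussed above, on whether the schedule is always read in the context of its room.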
[08:25:49] <alameda_> Hi, I'm having trouble with sorting. I have documents that looks like this: https://gist.github.com/daniel/ea24a0256e126f5b1def. But when I run .sort({order:1}) it sorts using the order field in the items array.
[08:52:21] <m3t4lukas> alameda_: you sure? that's somehow not possible :/
[08:53:07] <alameda_> you're right, I just discovered it doesn't sort on the order field in the items array
[08:53:25] <alameda_> but it doesn't sort right
[08:54:18] <alameda_> but it sorts on something, if I remove the sort() I get a different order
[08:55:34] <alameda_> oh I found it
[08:55:59] <alameda_> some of the field values were integer and some strings!
[08:56:22] <alameda_> like 1 and "2"
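The mixed-type sort alameda_ hit can be reproduced without a server: MongoDB's sort compares BSON *types* before values, and all numbers sort before all strings, so `1` and `"2"` never interleave. A minimal comparator mimicking that ordering (a sketch, not the server implementation):

```javascript
// Numbers rank before strings in BSON comparison order; within a type,
// compare values. This is a simplified stand-in for the real ordering.
function bsonishCompare(a, b) {
  const rank = v => (typeof v === "number" ? 1 : 2); // numbers < strings
  if (rank(a) !== rank(b)) return rank(a) - rank(b);
  return a < b ? -1 : a > b ? 1 : 0;
}

const orders = [3, "2", 1, "10"];
const sorted = orders.slice().sort(bsonishCompare);
// numbers first (1, 3), then strings lexicographically ("10" < "2")
console.log(sorted); // → [ 1, 3, '10', '2' ]
```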
[09:13:42] <m3t4lukas> alameda_: that might just have been the issue xD
[09:40:46] <daley_c> Hello
[09:40:51] <daley_c> Anybody around?
[14:00:21] <litecandle> Is there a general rule of thumb when it comes to sorting an array of objects? e.g. is it better to do it server side instead of using aggregate to unwind and sort?
[14:01:30] <StephenLynx> that depends on your priorities.
[14:02:18] <StephenLynx> and how you intend to use the data you are reading
[14:02:46] <StephenLynx> if you want to run a db operation AFTER the sorting, doing that in application code won't cut it.
[14:03:56] <litecandle> Hmm - well I have an array of user events that I'd like to sort by date. Thing is the user gets to add the events manually so it's unordered.
[14:04:29] <StephenLynx> you need to run db operations after the aggregation?
[14:04:41] <litecandle> Nope this is just for a view
[14:04:50] <litecandle> Well, an index page for the resource.
[14:05:18] <StephenLynx> IMO, sorting on application code is better.
[14:05:51] <litecandle> That's what I was leaning towards.
[14:07:54] <StephenLynx> it will be much faster than an unwind, sorting and grouping.
[14:11:58] <litecandle> Awesome, thank you!
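The two options discussed above, sketched with made-up field and collection names: sorting the embedded array in application code (the route StephenLynx recommends here), with the server-side unwind/sort/group pipeline shown for contrast in comments.

```javascript
// Option 1: sort the embedded array in application code after fetching
// the document. Field names are invented for this sketch.
const user = {
  name: "alice",
  events: [
    { title: "dentist", date: new Date("2016-02-03") },
    { title: "meeting", date: new Date("2016-01-29") },
    { title: "flight",  date: new Date("2016-02-01") }
  ]
};
user.events.sort((a, b) => a.date - b.date);
console.log(user.events.map(e => e.title)); // → [ 'meeting', 'flight', 'dentist' ]

// Option 2, for contrast: the server-side equivalent would be roughly
// db.users.aggregate([
//   { $unwind: "$events" },
//   { $sort: { "events.date": 1 } },
//   { $group: { _id: "$_id", events: { $push: "$events" } } }
// ])
// i.e. the unwind/sort/group that StephenLynx expects to be slower
// for a single document read for display.
```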
[16:12:29] <echelon> hey
[16:12:42] <echelon> i keep getting compilation errors when trying to build mongo-tools
[16:12:44] <echelon> https://pastee.org/nnq24
[16:12:49] <echelon> can someone take a look?
[16:14:11] <echelon> go build -o bsondump -tags "ssl sasl" "bsondump/main/bsondump.go"
[16:16:19] <echelon> this is the build script i'm using.. https://slackbuilds.org/slackbuilds/14.1/system/mongo-tools/mongo-tools.SlackBuild
[16:34:07] <echelon> hi
[16:34:40] <echelon> i think the use of ISODate() is what's causing the exception.. BSON representation of supplied JSON array is too large: code FailedToParse: FailedToParse: Bad characters in value
[16:35:09] <echelon> how do i make a field using ISODate json-friendly?
[16:36:03] <echelon> "timestamp" : ISODate("2016-01-28T11:29:25.436Z")
[16:36:19] <echelon> using mongoimport ^
[16:40:57] <echelon> ok, i found this.. http://grokbase.com/t/gg/mongodb-user/1247c1x2q6/isodate-exception-shell-vs-mongoimport
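The linked thread's fix, roughly: `ISODate(...)` is mongo *shell* syntax, not JSON, so mongoimport rejects it; strict extended JSON writes the same timestamp with the `$date` form instead. A small check that the rewritten line really is valid JSON:

```javascript
// Shell syntax (rejected by mongoimport):
//   "timestamp" : ISODate("2016-01-28T11:29:25.436Z")
// Extended JSON equivalent (accepted):
const line = '{ "timestamp": { "$date": "2016-01-28T11:29:25.436Z" } }';

const doc = JSON.parse(line);            // parses cleanly as JSON
const when = new Date(doc.timestamp.$date);
console.log(when.toISOString());         // → 2016-01-28T11:29:25.436Z
```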
[16:47:26] <echelon> GothAlice: thanks for your help yesterday, i think you unknowingly alluded to the solution yesterday
[17:00:36] <GothAlice> echelon: Well, it also didn't look like a JSON array: there were no preceding [ and trailing ] markers to indicate such. There were just objects ({}) that were comma separated.
[17:00:43] <GothAlice> I.e. the whole thing didn't look like JSON to me.
[17:00:55] <echelon> ah
[17:01:01] <echelon> thanks
[20:26:32] <alexi5> hello ladies and gentlemen
[20:30:52] <GothAlice> https://gist.github.com/amcgregor/cc94e321ea4f9976e5ea?ts=4 my brain is about to explode
[20:34:17] <alexi5> hmm.
[20:34:40] <cheeser> GothAlice: oof. that must've been fun to write and debug
[20:34:53] <GothAlice> It explodes pseudo-randomly, complaining that the i.$ projection is mis-matched.
[20:35:27] <GothAlice> I suspect the company evaluation (also an array) is somehow sometimes being evaluated first and reserving $ projection.
[20:36:03] <GothAlice> cheeser: This is the result of a higher-level abstraction, so things like those $ors are generated. ;)
[20:36:24] <GothAlice> The $and block is just standardized publication/retraction date range filtering.
[20:36:25] <cheeser> i was wondering. there are some (micro)optimizations to be made that suggest generation
[20:36:56] <GothAlice> I'm willing to take suggestions for optimization, though. I'm having to unroll this out of the abstraction due to a regression or three.
[20:37:05] <alexi5> i decided to start doing the schema for my application. my previous relational schema that i designed for the prototype had 20 tables, which reduced to 5 collections when doing a composite document schema
[20:37:36] <cheeser> GothAlice: mostly just combining terms to simplify the query. optimization might be the wrong word.
[20:38:15] <cheeser> time to pick up the kid at the bus stop, though.
[20:38:29] <GothAlice> The 'k': 'use' and 'day' items, despite comparing the same value in this example, rarely actually are. Most are things like 7 days, 1 use.
[20:38:56] <GothAlice> (This is the query to identify the package to use with sufficient remaining applicable balance.)
[20:39:49] <GothAlice> Ah, but I see why my current failure is a failure. 'company' is actually just 'c'.
[20:40:26] <GothAlice> And boom: "OperationFailure: database error: Executor error: BadValue positional operator element mismatch" again.
[20:45:59] <GothAlice> And fixed. Doesn't use $ projection any more, it now explicitly requests the (badly structured) specific abstract field.
[20:48:19] <albertom> hi
[20:48:27] <albertom> i am trying to build mongo 3.2.1 from sources
[20:48:29] <albertom> but
[20:48:41] <albertom> it seems it tries to build the windows msi installer :(
[20:49:07] <albertom> File "/builddir/build/BUILD/mongo-r3.2.1/src/mongo/installer/msi/SConscript", line 89:
[20:49:07] <albertom> major_version = "%s.%s" % (mv[0], mv[1])
[20:49:08] <albertom> error: Bad exit status from /var/tmp/rpm-tmp.Iz9GCp (%build)
[20:49:08] <shaboobal> hey there. are there any obvious reasons not to use mongodb? i have a somewhat relational data model. is that a no-no?
[20:49:43] <albertom> is there a way to tell scons to build explicitly for linux?
[20:49:44] <albertom> o.O
[20:59:18] <GothAlice> cheeser: I was surprised you didn't comment on my use of an ObjectId as a field name. ;P
[21:00:37] <GothAlice> shaboobal: MongoDB has no concept of a join, so relational models are entirely application-orchestrated, thus not particularly efficient. Additionally, multi-record, multi-table transactional updates aren't really a thing in MongoDB without a lot of extra work (simulating two-phase commits).
[21:01:32] <GothAlice> shaboobal: Lastly, if you have a graph model, i.e. you are storing social connections, for the love of all that is holy put that in a real graph database, don't simulate a graph on either MongoDB or a relational database.
[21:02:05] <shaboobal> GothAlice: so perhaps it's not suitable for storing the core of my data model? i have users, users have sites, sites have pages, etc.
[21:02:31] <GothAlice> I happen to use MongoDB for nearly everything, except the graph thing for which I use Neo4j.
[21:03:12] <GothAlice> A certain level of relational-ness is acceptable in MongoDB, but you need to be careful to model your data on how it's accessed, queried, and used, not in some "preferred perfect structure".
[21:11:59] <shaboobal> GothAlice: got it. do you prefer duplicated embedding or flattening out into collections?
[21:12:09] <GothAlice> Depends on the situation.
[21:12:34] <GothAlice> The example I typically use are forums; replies to a thread are embedded in the thread, but the thread references the forum it's under.
[21:13:18] <GothAlice> (This makes displaying a thread two queries: get the forum details, then get the thread contents.) Arrays can be sliced during projection letting even the embedded replies be paginated.
[21:14:56] <GothAlice> http://www.javaworld.com/article/2088406/enterprise-java/how-to-screw-up-your-mongodb-schema-design.html is a brief article I often link about this. :)
[21:17:22] <GothAlice> Embedding replies within the thread also saves on a number of other queries, though. No clean-up query to delete "related" data when deleting a thread, for example, as the replies are deleted with the thread they're contained in.
[21:17:58] <shaboobal> thanks that's helpful
[21:28:16] <GothAlice> shaboobal: Ah, an additional rationale for embedding: it effectively simulates a join. My forums have "permalinks" (technically an ObjectId) on every reply, but when jumping to show a specific reply (like the Twitter tweet card view of a single tweet) I naturally want to get the details of the overall thread at the same time, things like title, permissions, etc.
[21:29:19] <GothAlice> db.thread.find({"reply.id": ObjectId(…)}, {"title": 1, "reply.$": 1}) — load out a specific reply, regardless of thread or forum, getting the thread title and just the single reply out of the thread.
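The $slice pagination GothAlice mentions a few lines up could look roughly like this (shell-style, with illustrative collection and field names); the runnable part below just simulates the `[skip, limit]` semantics locally.

```javascript
// Shell-style sketch of paginating embedded replies with $slice:
//   db.thread.find({ _id: threadId }, { title: 1, reply: { $slice: [20, 10] } })
// i.e. skip the first 20 replies and return the next 10. For a positive
// skip, $slice: [skip, limit] behaves like Array.prototype.slice:
function sliceProjection(replies, skip, limit) {
  return replies.slice(skip, skip + limit);
}

const replies = Array.from({ length: 50 }, (_, i) => ({ n: i }));
const page = sliceProjection(replies, 20, 10);
console.log(page[0].n, page.length); // → 20 10
```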
[21:48:50] <sterns> in node, if I do myCollection.remove(query, function(err, result) {...} result.n = 1, from what I can tell 'n' is `documents scanned`. Can I determine how many records were actually removed?
[21:52:07] <GothAlice> sterns: nRemoved
[21:52:16] <GothAlice> :)
[21:53:17] <sterns> result == {ok: 1, n: 1}, is there some trick to getting nRemoved?
[21:53:29] <sterns> GothAlice: ^
[21:53:32] <GothAlice> I'm not familiar with the JS driver, so it might be shortening nRemoved to just n for you.
[21:53:52] <GothAlice> The literal getLastError result, .toJSON()'d, gives me nRemoved as the key.
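One way to sidestep guessing what `n` means in the node driver: the newer `deleteMany` (introduced in the 2.x driver) reports an explicit `deletedCount` on its result. A sketch under that assumption, with the collection passed in as a parameter so nothing here depends on a live server:

```javascript
// Sketch: prefer deleteMany()'s explicit deletedCount over remove()'s
// ambiguous `n`. The query shape here is invented for illustration;
// `collection` can be any object exposing deleteMany().
async function removeExpired(collection, cutoff) {
  const result = await collection.deleteMany({ expires: { $lt: cutoff } });
  return result.deletedCount; // number of documents actually removed
}

// Mock standing in for a real driver collection, to exercise the logic:
const fakeCollection = {
  deleteMany: async query => ({ acknowledged: true, deletedCount: 3 })
};

removeExpired(fakeCollection, new Date()).then(n => console.log(n)); // → 3
```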
[22:07:29] <xissburg> http://hackingdistributed.com/2013/01/29/mongo-ft/
[22:10:17] <GothAlice> Note the date.
[22:10:30] <GothAlice> See also: https://blog.serverdensity.com/does-everyone-hate-mongodb/
[22:13:27] <GothAlice> xissburg: Was there a point to linking that article?
[22:15:14] <xissburg> I am just seeing a lot of people saying bad things about mongodb
[22:15:48] <xissburg> https://news.ycombinator.com/item?id=9912842
[22:16:44] <GothAlice> Yup. Most particularly vocal denouncements don't RTFM, or ignore what they read. :/ On the "it's slow" front, I can show benchmarks showing ~4 million record operations per second (technically 1.9 million distributed RPC calls, but there's a round-trip in there to save the return value) on a single host five years ago. ;)
[22:17:18] <GothAlice> Most of the comments on that ycombinator thread are non-technical rants. :(
[22:17:18] <xissburg> the biggest problem I am reading about is inconsistency, and actual failures, not performance
[22:17:58] <GothAlice> During our evaluation of it at work we spun up 1000 nodes in a complex replica set + sharding setup, loaded the DB until we hit IO limits (random operations), then started kill -9'ing random whole VMs.
[22:18:29] <GothAlice> It took Igor (yeah, that's really that engineer's name) killing ~65% of the hosts randomly before errors started showing up on the loading scripts.
[22:18:37] <GothAlice> That satisfied our demands for reliability. ;)
[22:20:37] <GothAlice> With more than 30 terabytes of data in MongoDB, in use for the last five or six years, I have not encountered a single actual failure, unrecoverable state, or inconsistent operation. So… the majority of these articles tend to show me unrealistic failure scenarios ("well, if you toggle the router just *so* you can get some weird behaviour") or a lack of reading the manual.
[22:21:49] <xissburg> heh
[22:23:43] <GothAlice> Even more heh, that uninterrupted operation is in the face of some recently hilarious Rackspace maintenance windows. Until recently, my nodes had a greater than three year average uptime, but even when nodes were being cycled, the overall cluster barely noticed.
[23:00:51] <shaboobal> thanks for all the help GothAlice
[23:01:06] <GothAlice> No worries. :)
[23:01:21] <shaboobal> one thing that's kind of a bummer is looking at managed services for postgres vs mongo - about 10 times more storage in postgres for the same price
[23:04:34] <GothAlice> Well, "cloud hosting" is typically a tax on those unwilling or unable to manage the services themselves.
[23:05:36] <StephenLynx> that
[23:05:37] <GothAlice> Case in point: buying 24 HDDs and three 8-bay iSCSI RAID arrays pays for itself in three months vs. nearly any bulk file storage service. MongoHQ, for my dataset, would cost me half a million USD per month.
[23:05:43] <StephenLynx> managed services are for suckers
[23:07:37] <GothAlice> Yes and no. The current cloud deployment infrastructure I'm using is actually cheaper than if I continued using my own.
[23:10:27] <GothAlice> There are occasionally services out there who actually pass along economy of scale instead of simply trying to rip you off. ;P