[02:48:24] <morenoh149> what's wrong with the way I'm making my index? https://gist.github.com/morenoh149/cd2f44954ced3c0ea4f8#file-client-js-L58
[02:50:07] <cheeser> in the shell what does "db.rooms.getIndexes()" show?
[02:50:59] <joannac> morenoh149: you didn't specify ascending / descending
[02:51:21] <joannac> read the docs and look at the examples http://docs.mongodb.org/manual/reference/method/db.collection.createIndex/#examples
[02:51:51] <morenoh149> `collection.createIndex('a', {w:1}, function(err, indexName) {` is what I was following http://mongodb.github.io/node-mongodb-native/1.4/api-generated/collection.html?highlight=index
[02:52:02] <morenoh149> I tried {slug:1} and had the same error as now
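For reference, a hedged sketch of the shape joannac is pointing at, using the 1.4 node driver: pass a key document with explicit 1/-1 directions. The collection name comes from cheeser's question above; the actual error in the gist isn't visible here, so this is the canonical form rather than a confirmed fix:

```
// Explicit ascending index on `slug` with the 1.4 node driver.
db.collection('rooms').createIndex({ slug: 1 }, { w: 1 }, function (err, indexName) {
  if (err) throw err;
  console.log('created index:', indexName);
});
```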
[09:46:15] <pamp> I made a mongodump of some mongo 2.6 collections and then ran mongorestore. It worked without problems, but the migrated collections don't show up
[09:47:11] <pamp> db.stats() does show the two migrated collections and the space they occupy, but "show collections" doesn't list them
[09:56:21] <Folkol> Is it possible to perform an incremental map-reduce with mutable documents? In particular, is there some convenient way of removing the old contribution to the reduced value when I "re-reduce" a mutated document anew?
[10:09:09] <Folkol> I have a collection with documents. I am doing a map/reduce over that collection, which takes quite a lot of time. I would like to do an incremental map/reduce instead, but I do not have a good idea of how to separate "already reduced documents" from "new ones" for each incremental step - since the documents can change.
[10:09:41] <Folkol> I can store some "latest modified" timestamp in the documents and use that, but then the same document will contribute more than once to the reduced value.
[10:10:59] <Folkol> I could always store the old version of the documents in some special collection when I update them to handle this, but I do not know if this is a good strategy.
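A hedged sketch of the incremental pattern under discussion, assuming a lastModified field and a sum-style reduce (all names here are hypothetical). It shows the mechanics but not a solution to Folkol's open question, since a mutated document re-enters the reduce without its old contribution being removed:

```
// Remember when we last reduced, and only feed newer documents in.
var state = db.mr_state.findOne({ _id: "totals" }) || { ts: new Date(0) };
db.events.mapReduce(
  function () { emit(this.key, this.value); },
  function (key, values) { return Array.sum(values); },
  {
    query: { lastModified: { $gt: state.ts } },  // only new/changed docs
    out: { reduce: "totals" }                    // merge with prior results
  }
);
db.mr_state.save({ _id: "totals", ts: new Date() });
```

Keeping the previous version of each document, as suggested above, would let you emit a compensating negative value before re-reducing, which is one way to cancel the old contribution.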
[10:55:19] <benji__> Hey all, I'm really struggling to get the aggregation framework to pull out required records, would anyone have time to give me a hand?
[10:59:32] <benji__> For example, I'm trying to get a group-by-like query to work
[11:09:56] <kinlo> what exactly happens if you supply a replicaset in your connection string? Imagine I have a 3 node replica cluster, why do I need to supply the replicaset parameter in the connection string?
[11:15:15] <crowther> Can I search for a text string throughout a large document?
[11:29:41] <phutchins> When you have an app connecting to mongodb via a mongo config server, is the app connecting to that config server or is the config server simply telling the app which replica to connect to?
[12:13:07] <Tinu> any chance of help with my unsharded collection?
[12:13:21] <KekSi> i just wasted about 2 hours trying to find out why i couldn't mount my local data directory into a dockerized mongo - turns out that because I'm on OS X, the filesystem handed through boot2docker isn't supported :F
[12:13:22] <Tinu> Or rather sharded on three shards instead of all four
[12:23:42] <fl0w> Does anyone have a good tutorial style resource to get mongo up and running with replication and/or sharding? Something that’s a bit more introductory compared to the official docs preferably.
[12:31:50] <fl0w> Well, I’m willing to pay - that’s not the issue. I’m guessing I’ll learn more by “doing it myself”. Not that I think my own setup would perform better, but I just want to set up a replicated/sharded environment to get comfortable within it.
[12:32:10] <cheeser> i'm all for learning, for sure. that link will walk you through the cluster.
[12:33:01] <fl0w> Spun up a few linodes and figured I might as well try to break a sharded/replicated mongo. Aye, I’m on it as I type! :)
[12:44:46] <srimon> @cheeser: document structure is: {_id: 35, email: "sri@gmail.com", history: [{"id": 5, "email": "sri@gm.com"}, {"id": 6, "email": "test@gmail.com"}]}. In history, if I send id = 5 it needs to update that entry's email; if I send id = 7 it needs to insert a new entry
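A common two-step pattern for this (there was no single-statement "upsert into an array element" in the servers of the day): attempt a positional update, and $push only if nothing matched. A hedged shell sketch; the incoming value is hypothetical:

```
var entry = { id: 7, email: "new@example.com" };  // hypothetical incoming data
var res = db.users.update(
  { _id: 35, "history.id": entry.id },
  { $set: { "history.$.email": entry.email } }    // positional update if present
);
if (res.nMatched === 0) {                         // id not in history: insert it
  db.users.update({ _id: 35 }, { $push: { history: entry } });
}
```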
[13:07:06] <pluffsy> Is there any way to use JSON as input for extended data types (i.e. dates) with Casbah? By default it does not seem to support dates in the strict format, i.e. "mydate": {"$date": "<date>"}. I get "bad key" on $date. As I get JSON as input I need to be able to parse strict dates somehow; right now I’m looking at writing a regex, which feels hacky and suboptimal. Any ideas on this?
[13:07:42] <Tinu> KekSi do you have any ideas? Why my balancer is not sending any data to my new shard?
[13:07:54] <teds5443> hi all, I'm having an issue where mongodb is segfaulting everytime I submit a map/reduce query. This only happens on fedora though, on ubuntu it works fine (same server version). I'm not sure if I should file a bug or not.
[13:08:26] <Tinu> right now it's: mongos> sh.isBalancerRunning() → false, mongos> sh.getBalancerState() → true
[13:10:53] <benji__> Anyone know how to only $push distinct array elements? Or maybe count distinct array elements?
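For benji__: $addToSet is the distinct-aware counterpart of $push, and distinct elements can be counted in the pipeline. A hedged sketch with hypothetical collection and field names:

```
// Only add the value if it isn't already in the array:
db.posts.update({ _id: 1 }, { $addToSet: { tags: "mongodb" } });

// Count distinct array elements per document:
db.posts.aggregate([
  { $unwind: "$tags" },
  { $group: { _id: "$_id", distinctTags: { $addToSet: "$tags" } } },
  { $project: { count: { $size: "$distinctTags" } } }
]);
```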
[13:13:20] <kinlo> I've enabled auth and a keyfile, and I created my first user on my mongodb, but I'm still able to connect to the database without any authentication information. How do I prevent connections to the server without a valid username/pass combination?
[13:17:36] <Tinu> KekSi I found what the problem was
[13:17:46] <Tinu> there was a balancer activeWindow set up
[13:19:41] <kinlo> it does not, I connect on ip, not localhost
[13:19:58] <kinlo> note that authorisation works - I just want to prevent any connection at all
[13:20:47] <cheeser> so you can connect but you have to auth, yes?
[13:21:25] <kinlo> not really, I can still see information like the replica set name and whether the database is primary or not - seems a bit too much information, I'd prefer not to have any information at all
[13:22:21] <cheeser> sounds like a roles thing maybe: http://docs.mongodb.org/manual/core/authorization/
[13:25:02] <kinlo> cheeser: if I read the documentation correctly, you can only grant privileges, not revoke them, so I doubt it's a roles thing. I cannot log in to the database with any user/password combination unless it is correct - that's good. But when I just don't supply a username or password, I can still connect, which seems weird, because I enabled auth
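For context: even with auth enabled, mongod accepts TCP connections and answers a few pre-authentication commands such as isMaster, which is what exposes the replica set name and primary/secondary state kinlo is seeing; everything else is gated behind authentication. A hedged sketch of the 2.6-era startup flags involved (paths and set name are hypothetical):

```
# Require authentication; the keyfile doubles as the intra-replica-set secret.
mongod --auth --keyFile /etc/mongodb/keyfile --replSet rs0
```

Restricting who can even open a connection is done outside MongoDB, e.g. with bind_ip or a firewall.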
[14:34:42] <tsyd> What's weird is that this other document gets inserted fine: http://pastebin.com/ZncY2mBm
[14:39:31] <cheeser> which doc errors out? how are you importing?
[14:51:43] <NoOutlet> It seems like he's getting the default case on line 100 of this: https://github.com/mongodb/mongo-tools/blob/master/common/bsonutil/converter.go
[14:52:25] <NoOutlet> Though I don't know why the 767.0 would not match float.
[15:11:53] <soosfarm> hey, I'm running mongodb 2.6.8 with auth=true, I have added a siteAdmin user with these roles in the admin database 'userAdminAnyDatabase','readWriteAnyDatabase','dbAdmin','root'
[15:12:03] <soosfarm> but I can't connect to any other database with that user
[16:19:27] <benji__> Hi all, can anyone comment on how I should compare two string fields in a document? Currently working with { $match: { $eq: [ '$song_hotttnesss', '$artist_hotttnesss' ] } }, which does not work
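In 2.6/3.0, $match takes query operators rather than aggregation expressions, so the expression form of $eq has to be computed in an earlier stage and filtered on afterwards. A hedged sketch; the collection name is hypothetical, the field names are benji__'s:

```
db.songs.aggregate([
  { $project: {
      song_hotttnesss: 1,
      artist_hotttnesss: 1,
      // the expression form of $eq works inside $project:
      sameHot: { $eq: [ "$song_hotttnesss", "$artist_hotttnesss" ] }
  } },
  { $match: { sameHot: true } }  // $match then filters on the computed flag
]);
```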
[16:35:34] <MatheusOl> Hi. When does fsyncLock actually block reads, and why?
[16:37:55] <pasichnyk> I added a couple of priority:0, votes:0 nodes to my replica set to speed up our regional reporting instances (one per region). I set my connection string to readPreference=nearest, but it doesn't look like any queries are making it to these boxes. Any step I'm missing here? Or can priority:0 boxes without any votes not get read queries?
[16:40:39] <pasichnyk> fyi, i set them to votes:0, as there are more offsite reporting replicas (5) than my onsite copies (3), and i didn't want a network split to take down my onsite cluster.
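For reference, a hedged sketch of the member configuration pasichnyk is describing; host names are hypothetical. Note that priority:0, votes:0 members are eligible for reads, and that "nearest" is judged per connection by ping latency, so every connection string in play has to carry the read preference (which turns out to be the culprit later in the day):

```
// Add a never-primary, non-voting regional reporting member (mongo shell):
var cfg = rs.conf();
cfg.members.push({ _id: 8, host: "report-eu-1:27017", priority: 0, votes: 0 });
rs.reconfig(cfg);

// Driver side (connection string):
// mongodb://report-eu-1:27017,db1:27017/mydb?replicaSet=rs0&readPreference=nearest
```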
[16:49:08] <blaubarschbube> hi. i want to run mongodump for backup purposes. is there a way to only dump newest data instead of a whole collection?
[16:50:10] <pasichnyk> blaubarschbube, if you keep a timestamp of when you dumped last, you can limit the dump with query parameter (-q "{your query here}")
[16:51:29] <blaubarschbube> pasichnyk, thanks, that should do the job. Sorry for asking stupid questions, I'm just responsible for our backup, not for mongodb
[16:54:23] <pasichnyk> np, good luck. If you are on LVM volumes, you should also look into doing LVM snapshot backups instead of/in addition to mongodump. Especially if your database is huge.
[16:55:46] <pasichnyk> blaubarschbube, http://docs.mongodb.org/manual/tutorial/backup-with-filesystem-snapshots/ also look into MMS Backup as an option.
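A hedged sketch of the incremental-dump idea, assuming an indexed timestamp field (here called `ts`, a hypothetical name) and that the last run time is tracked externally:

```
# Dump only documents newer than the last backup.
# Extended-JSON $date takes epoch milliseconds in older tool versions.
mongodump --db mydb --collection events \
  -q '{ "ts": { "$gte": { "$date": 1426550400000 } } }' \
  --out /backups/incremental-2015-03-18
```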
[17:20:02] <delinquentme> does nosql for some reason not support 'hierarchy' within data?
[17:20:13] <delinquentme> Im not wildly familiar with it, but to my understanding it handles it just fine...
[17:21:31] <cheeser> "nosql" is a meaningless marketing term used to refer to dozens of unrelated products and technologies who only fleeting resemblance is that most aren't traditional relational databases.
[17:26:18] <pasichnyk> delinquentme - so... mongodb is "nosql" and you can build hierarchy in data with it if you want. Is that what you're looking for?
[17:26:49] <pasichnyk> you would, however, likely build it at a document level, or as a hierarchy between documents that you'd have to resolve yourself (no joins)
[17:27:11] <cheeser> those would be relations and not hierarchy, though.
[17:28:04] <pasichnyk> @cheeser sure. delinquentme - give me an example of what you're trying to do
[17:28:04] <delinquentme> Ohhh perhaps that's what this guy meant?
[17:28:20] <delinquentme> that joins need to be performed by the end user?
[17:29:07] <pasichnyk> delinquentme or you build your schema in such a way that you already have all the data you need in a single query, and you don't have to "join" - think denormalized table.
[17:29:25] <delinquentme> pasichnyk, im not doing anything right now =] headed into a design meeting tomorrow and one of our guys said
[17:29:55] <delinquentme> "NoSQL, it is a viable option, given that the model data does not need to be hierarchical... The rest of the business data does need to be hierarchical so we cannot do away with Postgres. "
[17:30:35] <pasichnyk> ok, if your guy thinks it's a good option based on the data model he has designed, then it probably is. :)
[17:32:42] <pasichnyk> asking for clarification is not a bad thing. its all how you ask...
[17:34:03] <pasichnyk> there are ways to implement "hierarchy" in mongo document structures, i.e., http://docs.mongodb.org/ecosystem/use-cases/pre-aggregated-reports/#add-intra-document-hierarchy Maybe he's talking about something totally different though. Hard to know without context.
[17:36:40] <pasichnyk> @cheeser this is more what i was talking about building a hierarchy by referencing other documents: http://docs.mongodb.org/ecosystem/use-cases/category-hierarchy/
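For the curious, the parent-reference pattern from that use case is small enough to show inline. A hedged sketch with hypothetical category names; resolving the chain is an application-side walk, since there are no joins:

```
// Each document points at its parent category.
db.categories.insert({ _id: "MongoDB",     parent: "Databases" });
db.categories.insert({ _id: "Databases",   parent: "Programming" });
db.categories.insert({ _id: "Programming", parent: null });

// Walk the ancestor chain in the application:
var node = db.categories.findOne({ _id: "MongoDB" }), path = [];
while (node && node.parent) {
  path.push(node.parent);
  node = db.categories.findOne({ _id: node.parent });
}
// path => [ "Databases", "Programming" ]
```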
[17:39:40] <pasichnyk> @cheeser did you see my question earlier about non-voting replica set members not seeming to take any read queries, even though the readPreference is nearest and there isn't another server for 1000 miles?
[17:40:35] <pasichnyk> curious if you have any thoughts. Seems like these guys should be serving read requests, but it doesn't appear so (I turned on profiling for all queries, and don't see anything besides my localhost monitoring queries)
[17:52:39] <pasichnyk> ah ha, I think there is another connection string hardcoded somewhere that didn't have the nearest flag. Ripping that out and hopefully it works. :)
[18:36:18] <culthero> Hello, would there be any reason in general why an aggregation pipeline query would run very, very slowly if the number of results was less than the limit? I have queries that return 102 results in a few milliseconds, and queries that have 97 results not return for 30-40 seconds.
[18:36:57] <culthero> The query WAS working just fine, the server rebooted and now things aren't working, and I have no clue where to start looking.
[18:37:44] <GothAlice> culthero: There are a number of potential issues. Ref: http://docs.mongodb.org/manual/core/aggregation-pipeline-optimization/ for a list of some and http://pauldone.blogspot.ca/2014/03/mongoparallelaggregation.html for one approach to improve performance in the general case.
[18:38:26] <GothAlice> However, often a sudden shift in performance can be attributed to your query no longer using an index, or no longer using the correct one. Not sure how to hint indexes in an aggregate, though.
[18:39:48] <GothAlice> Ah, on the last, ref: http://grokbase.com/t/gg/mongodb-user/131rg2zkbt/mongo-aggregate-query-doesnt-seem-to-be-using-the-index
[18:40:04] <culthero> I would imagine some kind of index went away. The collection I am aggregating has a TTL index; the script crashed, all the old records expired, and now only large results seem to return anything
[18:40:53] <GothAlice> Have you re-run your aggregate query's initial $match/$limit/$skip/$sort as a standard query with "explain"?
[18:41:10] <culthero> I am doing that now, I have to rewrite the query from mongoose debug to something CLI friendly. Standby
[18:42:42] <GothAlice> As meme-ish as "considered harmful" posts are these days, Mongoose deserves one. ¬_¬
[18:47:13] <culthero> yeah, this is a project that has the benefit of being done like a year and a half ago, and was just abandoned
[18:48:06] <NoOutlet> Those are the best. I'm on something like that right now.
[18:50:55] <culthero> I don't even remember mongo query syntax.. db.collection.aggregate( [ { $obj }, { $obj } ] ) should work? Or do I need to wrap that in braces?
[18:52:08] <culthero> that gives me a syntax error, I don't see it
[18:53:41] <culthero> and here are my indexes; http://pastebin.com/tbZ9v0q5
[18:58:11] <NoOutlet> Well, your timestamps will need to be converted to something MongoDB can understand.
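To expand on that: in the shell, BSON dates must be written as Date values, not strings. A hedged example of the kind of conversion meant here (collection and field names are hypothetical):

```
// A string like "2015-03-17 08:00:00" will not match a BSON date; use ISODate:
db.items.find({ created: { $gte: ISODate("2015-03-17T00:00:00Z") } });
```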
[18:59:50] <NoOutlet> And the explain() shortcut is only available on regular queries. Here's how to specify 'explain' on aggregate queries: http://docs.mongodb.org/manual/reference/method/db.collection.aggregate/#example-aggregate-method-explain-option
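Per the linked docs, from 2.6 onward explain is passed as an option to aggregate rather than chained as a method; a minimal sketch using the field discussed just below:

```
db.collection.aggregate(
  [ { $match: { phrases: "anxiety" } } ],
  { explain: true }  // returns the query plan instead of the results
);
```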
[19:01:56] <NoOutlet> Those are the syntax problems I see in your pipeline. There are also a couple things that are silly, but which might make sense when you are looking for more than one filter phrase or you are skipping some documents.
[19:03:03] <culthero> yeah I have removed the timestamps as that is a product of the script / mongoose and they should be irrelevant
[19:04:14] <NoOutlet> Basically, `phrases: { $in: ["anxiety"]}` is equivalent in returned results to `phrases: "anxiety"`.
[19:21:52] <GothAlice> {content: [{id: ObjectId(…), content: [{content: 'foo'}, {content: 'bar'}]}]} — amazingly, maintainable. I need to $elemMatch the nested "id", then positionally update the more deeply nested content. >:D
[19:22:06] <GothAlice> cheeser: Mad science is never stopping to ask: what's the worst that could happen? ;)
[19:25:35] <GothAlice> cheeser: http://cl.ly/image/3e3z2e1s3g080X3f262B sums up the rollercoaster of realizing the depth of the nesting. "Uh-oh." "Oh noes!" "I can make this work. >:3"
[19:26:27] <cheeser> i love that one so much. dude in the middle is the Master of Schadenfreude.
[19:28:00] <cheeser> for the interested, asya posted the first of a perf comparison with 2.6/3.0 http://www.mongodb.com/blog/post/performance-testing-mongodb-30-part-1-throughput-improvements-measured-ycsb
[19:29:41] <cheeser> "Based on these results, the optimal thread count for the application is somewhere between 16 and 64 threads, depending on whether we favor latency or throughput. At 64 threads, latency still looks quite good: the 99th percentile for reads is less than 1ms, and the 99th percentile for writes is less than 4ms. Meanwhile, throughput is over 130,000 ops/sec."
[19:33:44] <GothAlice> Until it died, though, the workload I was loading it with was performing noticeably better. Then 1000x worse for a few seconds prior to the explosion. (On a 16 GiB RAM machine with a 300 MiB analytics dataset.)
[19:39:31] <GothAlice> cheeser: That post is also rather poignant in its omission of disk (which should benefit greatly from compression) and RAM utilization statistics at each of those threading levels. :/
[20:16:04] <cheeser> that's what multi-datacenter clusters do
[20:21:04] <GothAlice> fl0w: I have 27 TiB of data in GridFS, in a 3x3 sharded replica set configuration. (Yes, my per-shard size exceeds RAM. It's a write-heavy dataset, so it's rarely a problem.)
[20:21:57] <cheeser> time to add a 4th and watch the balancer go ballistic.
[20:22:09] <GothAlice> dbclk: Yes, often it's a very good idea. What you're looking for is datacenter-aware replication: http://docs.mongodb.org/manual/data-center-awareness/
[20:22:54] <GothAlice> cheeser: I'm dreading the day I need to do that. I'd probably run a task to semi-manually balance at a substantial rate limit.
[20:23:42] <GothAlice> … not having my poor gigabit network get hosed? :P
[20:25:21] <GothAlice> Actually, a large migration event like that would probably cripple my current unbelievably reliable MTBF; considering my current drives, less the latest HDD in each array, are all > 3 years old, with 24/7 uptime, I'd likely lose one or more drives mid-balance.
[20:29:37] <fl0w> GothAlice: Correct me if I’m wrong here please but as I understand it - I can shard and effectively scale my HDD horizontally?
[20:34:19] <GothAlice> fl0w: Yes. The choice of sharding key allows you to have control over which shard your GridFS metadata and file chunks go to. Depending on your needs, you may wish to keep chunks relating to a single file together on the same shard, or spread them evenly amongst the shards.
[20:35:05] <GothAlice> (One could even intentionally make MongoDB _imbalanced_ in its sharding, i.e. if your shards don't all have the same amount of allocated disk space.)
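A hedged sketch of the two shard-key choices for GridFS chunks that GothAlice mentions; `files_id` and `n` are the standard GridFS chunk fields, the database name is hypothetical:

```
sh.enableSharding("grid");

// Keep all chunks of a file together on one shard (locality for whole-file reads):
sh.shardCollection("grid.fs.chunks", { files_id: 1, n: 1 });

// Or spread each file's chunks evenly across shards with a hashed key:
// sh.shardCollection("grid.fs.chunks", { files_id: "hashed" });
```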
[20:36:57] <dbclk> so currently guys what I have is 2 mongo instances with 1 arbiter
[20:37:13] <GothAlice> (Or if one of your shards is in a different datacenter. You can effectively set up geographically distributed data this way.))
[20:37:18] <fl0w> GothAlice: And the minimum number of shards is 3, correct? To get the redundancy, that is?
[20:37:19] <dbclk> one of those instances is on a server and the other, along with the arbiter, is on another server
[20:38:02] <GothAlice> fl0w: Sharding does not give you redundancy. Replication gives you redundancy. This is why I described my setup as a "3x3" sharded replica set. (That's 3 shards, each containing three nodes in a replica set.)
[20:38:03] <dbclk> what I'm seeing is: if I shut down the server with the one mongo instance and the arbiter
[20:38:11] <dbclk> the other server elects itself as secondary
[20:38:42] <GothAlice> dbclk: Never have your arbiter on the same physical node as a real data-serving mongod. To do so makes having the arbiter pointless 50% of the time.
[20:39:16] <fl0w> GothAlice: Hm. But if I lose one shard, isn't there still data integrity? Like in a RAID 2 setup?
[20:39:21] <GothAlice> (I.e. if the node containing both mongod and the arbiter goes down, the remaining replica stops accepting reservations and packs up shop. ;)
[20:39:59] <fl0w> GothAlice: Oh. Well, back to the docs then.
[20:40:09] <GothAlice> fl0w: Think of sharding as "stripe" RAID, and replication as "mirroring" RAID. To achieve both performance _and_ reliability, use both. :)
[20:42:06] <GothAlice> dbclk: And if the remaining nodes are unable to see >50% of the known nodes (i.e. if only one replica remains after the other replica and the arbiter go down), it goes read-only.
[20:42:06] <dbclk> but would a node specified as secondary be prevented from serving data?
[20:42:25] <GothAlice> (Read-only. Writes will fail, but reads should still work.)
[20:43:11] <GothAlice> OTOH, with the arbiter on, say, the application node (not sharing space with a data node), the replica set can recover, the remaining secondary elects itself as primary with the help of the arbiter, and your application continues, blissfully unaware that hell just broke loose somewhere. ;)
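The layout fix being described, sketched in the shell; host names are hypothetical. The arbiter sits on a third machine (an application server is fine) so that losing either data-bearing node still leaves a visible majority:

```
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "db1:27017" },
    { _id: 1, host: "db2:27017" },
    { _id: 2, host: "app1:27017", arbiterOnly: true }  // votes, holds no data
  ]
});
```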
[20:43:40] <GothAlice> (High-availability is cool, like bow-ties.)
[20:45:35] <GothAlice> dbclk: If you learn well from code, https://gist.github.com/amcgregor/c33da0d76350f7018875#file-cluster-sh-L96-L114 is a little script of mine to start up a 2x3 sharded replica set with authentication. The highlighted code spawns the pool of processes, then 120-124 configures sharding. (This script is for testing and demonstration purposes, thus all daemons on one host.)
[20:46:08] <GothAlice> This script could be useful to let you easily, quickly, and without permanent effect test out different sharding strategies to see how things get balanced.
[20:47:17] <GothAlice> (Just adjust the variables at the top to taste. It's safe to nuke the contents of the data directory between runs, it'll get recreated, just remember to issue a stop command first. ;)
[21:07:23] <NoOutlet> I have one of those too, Alice.
[21:20:22] <NoOutlet> When you say "armouring code", do you mean the port checking?
[21:21:05] <GothAlice> Aye. I have some similar code in my SSH tunneling code to check a single port on the remote side, but you're conveniently checking everything at once.
[21:22:55] <GothAlice> NoOutlet: https://gist.github.com/amcgregor/3983451 is my 'tunnel' script, which I think you might enjoy. (This one's a fair bit older than the sharding one. ;)
[21:24:40] <GothAlice> (Executed as: "tunnel <local-port>" or "tunnel <local-host> <local-port>")
[21:35:15] <NoOutlet> Nice Alice. I don't know much about tunneling but I like the practice of echoing the actual command that is executing.
[21:45:23] <epx998> in mongo 2.4, can i have a replica set and a slave? or is it one or the other? or one and the same?
[21:45:26] <GothAlice> NoOutlet: I mainly install this on my client's machines, so I can tell them to run support.script (shell script, effectively, but a launchable icon) to tunnel VNC/5900 to one of my servers for support, and they just need to read out the number presented.
[21:45:34] <GothAlice> NoOutlet: Or, likewise, I use this to tunnel VNC from my machine to a public server (w/ HTML5 viewer) for webinars instead of the various cloud offerings. (Free is good… not installing extra software is even better. ;)
[21:45:51] <GothAlice> epx998: One or the other, and "master/slave" configurations are now fully deprecated. As is 2.4. ;)
[21:48:43] <epx998> we bought some company that runs 2.2 or 2.4 master/replica and i need to migrate it to a different datacenter
[21:53:00] <epx998> guess what they want is for me to add a replica to the master at the new dc, have it sync, then shut down the service, reconfig it as a master and bring it up
[21:54:44] <GothAlice> epx998: https://jira.mongodb.org/browse/SERVER-17278?filter=17502 < these are the improvements, large-scale changes, bug and security fixes released within the 2.6 lifespan. You're 2054 issues behind… being behind. There are an additional 1675 resolved issues leading up to the current 3.0.0 release, which so far, I'm loving. (Ref: https://jira.mongodb.org/browse/SERVER-17444?filter=17503)
[21:55:10] <fl0w> I’m having a real hard time understanding what WiredTiger actually is, what problems it solves, and how it’s different from the standard engine - any recommendations on resources aimed towards rookies?
[21:55:47] <GothAlice> Oop, forgot to filter to resolved status. D'oh. *updates*
[21:56:57] <GothAlice> fl0w: It uses a combination of techniques, taking more responsibility for management away from the kernel so as to apply more intelligent handling to IO, for example, and optionally using a more modern B-Tree standard that is more efficient in order to give a wide variety of advantages.
[21:58:54] <fl0w> GothAlice: Ok. But is it supposed to be a replacement or a supplement? If the latter is the case, under what circumstances is WiredTiger preferred?
[21:59:31] <NoOutlet> It's an option for storage engine.
[22:00:02] <GothAlice> fl0w: Lock-free algorithms are awesome. (Conditional upserts in RAM.) It better supports SSD backing stores by supporting no-overwrite storage, etc. For details, see: http://source.wiredtiger.com/2.5.1/index.html (notably the Architecture section)
[22:01:29] <GothAlice> As a MongoDB back-end it might not be able to do some of these things… yet. (This is documentation for the underlying engine that was bought by 10gen… I know not of its relevance to the WiredTiger implementation in the MongoDB code.) See also http://docs.mongodb.org/v3.0/release-notes/3.0/#major-changes
[22:04:40] <NoOutlet> The idea is that there can be other storage engines developed by storage engine specialists and now those storage engines can be used with MongoDB. WiredTiger is the first official alternative to the mmapv1.
[22:05:30] <fl0w> Ah, wiredtiger.com had what I wanted :) Though a bit too technical at this stage for me. Either way, the one thing I’m really lacking is transactions - but I guess that defies mongodb conceptually?
[22:06:44] <GothAlice> fl0w: WiredTiger supports compression, both fast (like, ludicrous-speed fast) but good-enough compression, and zlib compression. It supports finer-grained locking, which greatly reduces read and write latencies, ref: http://www.mongodb.com/blog/post/performance-testing-mongodb-30-part-1-throughput-improvements-measured-ycsb (it's very quick) For more polished overview: http://www.mongodb.com/mongodb-3.0#overview
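Selecting the engine is a startup-time choice in 3.0 (per the release notes linked above); a minimal sketch, with the compression default spelled out explicitly:

```
# Start mongod on WiredTiger with snappy block compression (the 3.0 default).
mongod --storageEngine wiredTiger \
       --wiredTigerCollectionBlockCompressor snappy \
       --dbpath /data/wt
```

Note that an existing mmapv1 dbpath cannot be opened by WiredTiger; migration goes through mongodump/mongorestore or an initial sync from a replica set member.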
[22:07:18] <fl0w> yea I’m reading that blog post as we speak!
[22:08:11] <fl0w> the throughput really does go … well, through the roof.
[22:08:38] <GothAlice> (Anywhere you see 2.8, read 3.0.)
[22:10:02] <GothAlice> Buffering… fractal tree… lazy queue… things… of pure magic. (You'll have to look through their blog for mmapv1—MongoDB 2.6—benchmarks to compare against the WiredTiger benchmark previously linked.)
[22:26:39] <Freman> then I can print it out and use it to beat the guy responsible
[22:27:00] <GothAlice> http://qnalist.com/questions/5276891/mongod-locks-up-on-btrfs problem report involving lockups… Exocortex is still digging. (It's been a while since btrfs has come up…)
[22:30:07] <GothAlice> It's not exactly the density of a phone-book, but it's the fourth major point.
[22:31:09] <GothAlice> Freman: However, your journal is currently borked. You may be able to rename the on-disk journal files out of the way and attempt to continue with the data as-is. This may require running mongod once with --repair to rebuild the collections.
[22:31:21] <Freman> I'm wondering if btrfs+mongo is contributing to other issues we have (especially speed)
[22:31:24] <GothAlice> (Renaming it out of the way after shutting down will let it be recreated on startup.)
[22:31:40] <GothAlice> Freman: It certainly would. It'd also heavily contribute to disk wear, esp. on SSDs.
[22:32:45] <Freman> so, mv journal journal.borked, /usr/local/bin/mongod --config /var/db/mongodb.conf --repair
[22:32:58] <GothAlice> And attempt to go from there, aye.
[22:34:20] <GothAlice> To rebuild, you'll want 1.2x your dataSize in free disk space, BTW. (It literally rebuilds into a copy, validates, then moves the copy over the original, so much space is needed.)
[22:34:48] <Freman> 2015-03-18T08:34:18.162+1000 I - [initandlisten] Fatal assertion 18506 OutOfDiskSpace Cannot repair database logsearch having size: 1405865689088 (bytes) because free disk space is: 108800696320 (bytes)
[22:35:23] <fl0w> I'm out - started at university.. so I guess beer, boobs and sleepless nights are in order. night ya’ll, and thanks for all the help and suggestions!
[22:35:42] <GothAlice> fl0w: Have a great one. It never hurts to help.
[22:37:33] <GothAlice> Freman: Heh, we only keep the last week's of logs in MongoDB, and cherry-pick interesting records for archival elsewhere. Nuking logs is less of a problem for us.
[22:37:48] <GothAlice> (Also, capped collections are awesome.)
[22:38:34] <Freman> if your crap broke 501 executions ago then it's your fault for not looking sooner
[22:41:25] <GothAlice> Freman: ^_^ It's really interesting to see other strategies. In our nearly two-year-old DB at work our op counters read only 8 deletions not related to TTL indexes. We… keep interesting data forever, and let the uninteresting stuff die of natural causes. And every in-app delete button is secured with "is_alice" ACLs. ¬_¬
[22:47:37] <GothAlice> Freman: Also, apologies for being the bearer of bad news on the btrfs thing. :/ It's a neat project, but I've seen it bite others in the past.
[22:53:36] <GothAlice> https://github.com/zfsonlinux/zfs/issues/326 is a bummer for the Linux implementation, though.
[22:54:44] <Freman> so this is the part of mongo I've never done before, adding a user
[23:00:32] <GothAlice> collMod, compact, enableProfiler, indexStats, reIndex, repairDatabase, and validate, amongst others are dbAdmin/dbOwner, but dbOwner also gives user management roles on the target namespaces.
[23:00:36] <Freman> I have a couple of bugs in my zfs install (I allocated too much space to cache/log disk) and I need module upgrades...
[23:03:34] <Freman> I don't have an indexStats role
[23:03:57] <GothAlice> Those are "permissions" to commands given out by roles.
[23:04:30] <GothAlice> dbAdmin is one role that would grant that permission, as well as access to the "reIndex" command.
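Putting that together for Freman: you grant a role that carries those privileges rather than the privilege names themselves. A hedged sketch; user name and password are hypothetical:

```
// From the mongo shell, after `use logsearch`:
db.createUser({
  user: "freman",
  pwd: "s3cret",
  roles: [
    { role: "dbAdmin",   db: "logsearch" },  // grants reIndex, collMod, validate, …
    { role: "readWrite", db: "logsearch" }
  ]
});
```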
[23:16:19] <Freman> yay, moved the old 1.4tb db over, created the new one, added roles, made it go, re-created indexes (because my particular app is dumb and doesn't create them itself... will add that in the next release)
[23:16:37] <GothAlice> Freman: Be careful having indexes be automatically created.
[23:16:42] <Freman> best bit... if no-one asks for this old log data in the next few days we get to throw it out and prove my point about general purpose loging
[23:17:12] <GothAlice> If run in foreground, they'll block the collection. If run in the background, they could take a very long time to be active. (Scheduling index creation maintenance windows can be useful.)
[23:18:33] <Freman> meh, I was just going to make it go "oh, first record? add indexes"
[23:21:33] <GothAlice> There is overhead in an ensureIndex/createIndex call…
[23:26:04] <GothAlice> Oh, that explains a few things. mms-monitoring-agent can't find libsasl2.so.2 — I've got libsasl2.so.3.
[23:26:47] <GothAlice> Fixed and pinned. Silly upgrades.
[23:45:51] <Freman> I literally... this morning I woke up convinced it was Friday... that should be reason enough to stay home on a Wednesday
[23:46:56] <GothAlice> I haven't slept since the few hours I caught Sunday night.
[23:48:01] <GothAlice> I just wish the MMS interface didn't keep crashing the browser tab. :/
[23:48:08] <GothAlice> That dashboard chart view is intense.