PMXBOT Log file Viewer


#mongodb logs for Monday the 27th of October, 2014

[00:06:45] <Forest> Boomtime: it works, thanks again and have a nice day :) It's late night here, I am gonna sleep. I am so happy I finally made it, you have no idea how happy I am. Goodbye, friend :)
[00:22:36] <llakey_> what are some workarounds or fixes to this error: BufBuilder attempted to grow past the 64MB limit
[00:23:42] <Boomtime> llakey_: where do you see this error?
[00:23:57] <llakey_> in the mongo logs
[00:24:11] <llakey_> Sun Oct 26 03:26:11.118 [conn363824] Assertion: 13548:BufBuilder attempted to grow() to 1073741824 bytes, past the 64MB limit.
[00:24:42] <Forest> Boomtime: I just wondered, can you please explain the mongodb document limit to me? Because I am splitting the big arrays into 15 MB chunks in order to get them into mongo, but as you said before my documents are 100 KB, so can I send bigger arrays into the insert function?
[00:25:22] <Boomtime> llakey_: what is that client attempting to do at the time? I have not seen that error before
[00:26:20] <joannac> llakey_: Also, what version of mongodb?
[00:26:31] <llakey_> joannac: 2.4.8
[00:26:37] <llakey_> Boomtime: looking
[00:26:45] <Boomtime> llakey_: nevermind, I found it: https://jira.mongodb.org/browse/SERVER-13580
[00:27:01] <Boomtime> the problem is the aggregation pipeline you are running produces intermediate results that are too large
[00:27:40] <Boomtime> that server ticket refers to making the error message more appropriate/readable
[00:28:24] <llakey_> Boomtime: what options do i have to avoid or workaround this limit?
[00:28:25] <Boomtime> Forest: mongodb has a 16MB per document limit, that is all - is there something more you need to know?
[00:29:28] <Forest> Boomtime: per document, but I am sending 15 MB of documents consisting of many items around 100 KB each. So is the 16 MB an upper limit for this operation too?
[00:29:33] <Boomtime> llakey_: re-evaluate your aggregation pipeline - if you have sort stages (with a large input) that aren't indexed then you're in trouble, etc
[00:30:06] <Boomtime> Forest: no, those will be encoded appropriately by the driver
[00:30:41] <Forest> Boomtime: Because whenever I tried to insert more than a 16 MB array it didn't insert all of the documents.
[00:30:50] <cheeser> http://docs.mongodb.org/manual/reference/command/aggregate/#dbcmd.aggregate
[00:30:55] <cheeser> look at allowDiskUse
[00:31:04] <Boomtime> Forest: what error did you get?
[00:31:33] <Boomtime> llakey_: cheeser's comment was for you (allowDiskUse is another way around the problem)
[00:33:09] <joannac> llakey_: I would try and limit the size of stages, rather than use allowDiskUse
[00:33:39] <joannac> you don't want to be writing to disk all the time if you don't need to
[00:35:18] <llakey_> Boomtime: joannac: cool. thanks
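The two workarounds discussed above can be sketched in a single pipeline. The stage contents below are invented for illustration; the point is the shape: a selective `$match` before any unindexed `$sort` (joannac's suggestion), with `allowDiskUse` as the fallback cheeser points to.

```python
# Hypothetical pipeline in the spirit of llakey_'s failing aggregation
# (field names and values are made up):
pipeline = [
    {"$match": {"day": "2014-10-26"}},                  # shrink the sort input first
    {"$sort": {"ts": 1}},                               # unindexed sorts of large inputs hit the 64MB limit
    {"$group": {"_id": "$conn", "n": {"$sum": 1}}},
]

# With pymongo 2.7+ against MongoDB 2.6+, this would be run as:
#   db.logs.aggregate(pipeline, allowDiskUse=True)
# allowDiskUse lets stages spill to disk instead of failing at 64MB,
# at the cost of the disk writes joannac warns about.
```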
[00:37:50] <Forest> Boomtime: strange, it seems to be working, it's just slower if I insert 60 MB arrays instead of 15
[00:39:59] <Boomtime> it should certainly take longer to insert 60MB of data than 15MB.. so if you insert in a loop, each loop will take four times longer
[00:41:18] <Boomtime> also, what writeConcern are you using? (http://docs.mongodb.org/manual/core/write-concern/)
[01:17:32] <lpghatguy> Is setting 'required' for a document's property redundant if it already has 'default' set?
[01:17:56] <lpghatguy> (this is in the context of Mongoose)
[08:01:17] <agend_> hi - have a question about setting up mongo sharded cluster - in docs it says: 'To avoid downtime, give each config server a logical DNS name' - does it mean I need to set up my own dns server - or is it enough to edit /etc/hosts?
[08:15:17] <bin> guys, what could be the reason the db is storing times 2 hours behind?
[08:15:36] <bin> i mean my timezone is +2 but on the mongodb it's like +0 .. so should i set the db time or something?
[08:15:44] <bin> the application is sending the right time ..
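What bin is seeing is almost certainly not an error: MongoDB stores BSON dates in UTC, and the shell displays them in UTC too. The fix is to convert on display rather than shifting the stored value; a minimal sketch with an invented timestamp:

```python
from datetime import datetime, timedelta, timezone

# What mongod stores/shows: a UTC instant (example value invented).
stored = datetime(2014, 10, 27, 8, 15, tzinfo=timezone.utc)

# What bin's +2 application actually meant: the same instant, local wall time.
local = stored.astimezone(timezone(timedelta(hours=2)))

local.hour  # 10 -- same moment, different rendering
```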
[10:07:47] <chovy_> how do i tell debian to start mongo with --auth?
[10:20:08] <chovy> i'm confused about using auth. i have an open db right now, and i want to setup an admin user (global) and then for each db, setup its own admin user for just that database.
[10:20:42] <chovy> i uncommented the 'auth = true' in /etc/mongo.confg and restarted, but i can still connect to all databases w/o authenticating.
[11:21:21] <agend_> just to be sure - does the aggregation framework work on shards? - i mean if i want to sum data in 2 different sharks mongo wouldn't get lost ???
[11:21:55] <agend_> yeah - not sharks but shards :)
[12:09:39] <drag0nius> does mongodb have config verification tool?
[12:38:41] <blaubarschbube> hi. i have a mongodb server running. now i want to add a second node on a second server. are the specs (ram, cpu, whatever) of this new server mandatory or does the load balancer handle this?
[12:47:10] <drag0nius> what exactly is mongodb's keyFile format?
[13:05:47] <deathknight> I have a thing that takes mongo data and streams it to a relational database...where are some large mongo datasets I can use to test the thing?
[13:07:31] <cheeser> you could import the enron corpus...
[13:07:40] <cheeser> http://mongodb-enron-email.s3-website-us-east-1.amazonaws.com/
[13:11:46] <deathknight> cheeser: Thanks! Already went through that. Looking for more...super hungry. Plus the long strings in that dataset breaks HP vertica... :'(
[14:16:25] <GothAlice> agend_: Yes, aggregation works with sharding. Perform the aggregate query on the balancer, it'll work out which shards to actually query to get your data.
[14:17:03] <GothAlice> blaubarschbube: While you don't need to keep shard hardware identical, it's a good idea to keep them as similar as possible to avoid strange performance issues later.
[14:17:25] <agend_> GothAlice: thanks
[14:17:35] <GothAlice> Oh, good, that wasn't a 24h delayed scrollback buffer. XD
[14:19:04] <GothAlice> chovy: Authentication is ignored on localhost until you add an admin user. Step 1 is enable auth, step 2 is add global admin, step 3 is use that global admin to create per-database users.
[14:19:11] <agend_> GothAlice: but balancer does chunks splitting + migration - u have meant perform query on mongos - right?
[14:19:21] <GothAlice> agend_: Yes.
[14:19:29] <GothAlice> mongos is your friend here.
[14:19:45] <GothAlice> chovy: See: http://docs.mongodb.org/manual/tutorial/enable-authentication/
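GothAlice's step 2 (create the global admin) boils down to one command document. A sketch of what pymongo would send, with the user name and password invented:

```python
# Step 2 of the bootstrap: the first user must be an admin, created in the
# "admin" database. Role name matches MongoDB 2.6 built-in roles.
create_admin = {
    "createUser": "siteAdmin",          # invented name
    "pwd": "change-me",                 # placeholder
    "roles": [{"role": "userAdminAnyDatabase", "db": "admin"}],
}

# Roughly equivalent to running, against the admin database:
#   pymongo:      client.admin.command(create_admin)
#   mongo shell:  use admin; db.createUser({user: ..., pwd: ..., roles: [...]})
# After this, per-database users are created the same way in each database.
```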
[14:23:06] <GothAlice> drag0nius: Run mongod -f /path/to/conf — if it works, it works. If it didn't, log output might explain why.
[14:23:30] <drag0nius> i just thought there is some easier way
[14:23:39] <GothAlice> drag0nius: Configuration "validation" is surprisingly difficult. You can have a syntactically correct configuration file that's semantically gibberish for the runtime.
[14:24:23] <GothAlice> (So you end up having to run it through the real mongod process anyway to make sure it behaves during runtime as you expect according to the configuration.)
[14:31:07] <blaubarschbube> thanks GothAlice
[14:33:33] <GothAlice> ^_^;
[14:34:14] <odin_> why does mongodb chew up so much CPU time when idle ?
[14:34:53] <GothAlice> odin_: It has management tasks; you can run db.currentOp(true) to see what is taking most of the time. http://docs.mongodb.org/manual/reference/method/db.currentOp/
[14:35:13] <GothAlice> odin_: For example, once per minute it runs through all timed expiry indexes and culls records.
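The timed-expiry indexes GothAlice mentions are TTL indexes: an index with `expireAfterSeconds` set, which a background thread sweeps roughly once a minute. A sketch of the index specification (collection and field names invented):

```python
# TTL index spec: documents whose "createdAt" date is more than an hour old
# become eligible for deletion on the next background sweep.
ttl_index = {
    "key": {"createdAt": 1},        # must index a BSON date field
    "expireAfterSeconds": 3600,
}

# pymongo equivalent (pymongo 2.x era):
#   db.sessions.ensure_index("createdAt", expireAfterSeconds=3600)
# Note: expiry is approximate -- the sweep runs periodically, so documents can
# outlive their deadline by up to a sweep interval.
```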
[14:35:25] <odin_> but there is no data in the database (yet)
[14:35:30] <GothAlice> (Which is where most of my idle CPU goes.)
[14:35:40] <GothAlice> Hmm; currentOp() will show you what it *thinks* it's doing.
[14:35:46] <odin_> so I would expect timer lookup to take < 0.0001 seconds and that task terminate
[14:37:27] <odin_> I see ticket#10696 (closed: invalid) seem to be try of my version: select(11, [9 10], NULL, NULL, {0, 10000}) = 0 (Timeout) ... this is a 100Hz timer / context switch ?
[14:37:34] <odin_> s/try/true/
[14:38:02] <GothAlice> I'm not sure what that line represents. C?
[14:38:38] <odin_> strace output (for system call tracing), I was verifying the top google hit for "mongodb cpu usage idle"
[14:39:24] <agend_> GothAlice: would u mind taking a look at my previous question: "about setting up mongo sharded cluster - in docs it says: 'To avoid downtime, give each config server a logical DNS name' - does it mean I need to set up my own dns server - or is it enough to edit /etc/hosts?"
[14:39:57] <odin_> what needs attention at the rate of 100Hz ? surely timer events know the length of time to the next expiry, and can use a longer select() timeout value accordingly
[14:40:05] <GothAlice> agend_: /etc/hosts can be enough. I've got a management system which hands out real DNS names, but to avoid propagation delays it also configures /etc/hosts across the cluster.
[14:40:38] <GothAlice> odin_: Nope; since that "next expiry" would have to be constantly managed.
[14:40:53] <tscanausa> agend_: /etc/hosts should be fine
[14:41:02] <GothAlice> odin_: Insert a new record, it can change. Run a multi-document update, it can change in mathematically-derived ways.
[14:41:03] <odin_> yes this is normal for absolute/relative timer handling
[14:41:56] <odin_> timer handling is a well solved problem, looks like a naive implementation if it re-checks all timers at 100Hz
[14:42:11] <agend_> GothAlice: so how does it work - when mongo tries to find some machine by name - and ask dns first - does it cache the ip address at all? does it ask dns every single time? does it hold ip cached for some time?
[14:42:59] <tscanausa> agend_: its a system call that the os handles
[14:43:05] <GothAlice> agend_: gethostbyname() — on UN*X-like machines that evaluates /etc/resolv.conf, checks /etc/hosts, then calls out to DNS as configured in resolv.conf.
[14:43:36] <GothAlice> (Which may include a search path, i.e. a lookup for rs1s4 can be transformed into rs1s4.db.example.com
[14:44:09] <odin_> FWIW next expiry is already being constantly managed, at a rate of 100Hz, instead of only when a timeout changes (such as adding/removing timers)
[14:44:39] <odin_> usually it can be very efficient to check whether the timer being changed now impacts the current "next expires" value; since most timers probably expire later, most of the time no disruption occurs
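The "well solved problem" odin_ has in mind is next-expiry scheduling: keep timers in a min-heap and sleep until the soonest deadline, instead of polling at 100Hz. A minimal sketch (class and method names invented):

```python
import heapq
import itertools

class Timers:
    """Toy next-expiry timer set: sleep until the earliest deadline."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # tie-breaker for equal deadlines

    def add(self, deadline, callback):
        heapq.heappush(self._heap, (deadline, next(self._seq), callback))

    def next_expiry(self):
        """Deadline the event loop may sleep until, or None if no timers."""
        return self._heap[0][0] if self._heap else None

    def fire_due(self, now):
        """Run every callback whose deadline has passed; return their results."""
        fired = []
        while self._heap and self._heap[0][0] <= now:
            _, _, cb = heapq.heappop(self._heap)
            fired.append(cb())
        return fired
```

GothAlice's counterpoint still stands: this works cleanly for discrete scheduled tasks, but "any record in a balanced cluster may expire" makes the next deadline expensive to track.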
[14:44:52] <agend_> GothAlice: ok, I have a problem with nginx and docker container - after container restart it gets a new ip address, docker is responsible for updating /etc/hosts for nginx - but nginx seems to not bother about the change and use old ip
[14:45:33] <GothAlice> odin_: Because across a balanced cluster, you can't reliably *track* the next timer in the way you suggest. I've built several of these types of timer systems (notably in some more complex coroutine trampolines; now I just use apscheduler) and that type of tracking is not fun. Works for discrete scheduled tasks, not so much for "any record in the database may expire".
[14:45:59] <GothAlice> (Across a balanced cluster.)
[14:46:03] <odin_> 18 seconds of CPU time used in 1 hours of wall clock time
[14:46:07] <agend_> GothAlice: i guess nginx might do some ip caching - or i should check gethostbyname() docs
[14:46:23] <agend_> tscanausa: thanks
[14:46:31] <odin_> this isn't the main problem, the /proc/6102/status shows 300k context switches
[14:47:09] <odin_> since the code that runs for 18 seconds is probably faster than the cost the kernel uses to switch task there and back
[14:47:37] <odin_> but there is no cluster enabled, so there must be an efficiency saving to be had there
[14:48:34] <GothAlice> odin_: For comparison, the primary in my production cluster has 196 hours of wall time against 28 days of uptime (Rackspace needed to reboot for some dom0 maintenance at the beginning of the month…) and 62,937 mandatory context switches, and 244,764,764 voluntary context switches.
[14:50:11] <odin_> I have 300k voluntary within 1 hour of use, there is no data in database, its just a fresh install
[14:50:19] <GothAlice> That's weird.
[14:50:26] <odin_> now your machine might be dedicated to mongodb usage, so might not context switch
[14:50:39] <GothAlice> It certainly is; the DB cluster runs pure.
[14:50:42] <odin_> but my system the CPUs are always busy doing something else
[14:51:50] <odin_> well on a multicore system, dedicated to mongodb usage, I would expect much lower context switches, since much of the time there is another core to use for a cron job or whatever other activity
[14:52:24] <GothAlice> I have processor affinity carefully tuned, too, to ensure as few conflicts as possible.
[14:52:50] <odin_> even so, you get automatic CPU affinity effects (if you system is dedicated for mongodb)
[14:53:11] <odin_> as there is little else going on on that system, the kernel will automatically try to maintain affinity
[14:54:43] <odin_> setup a 48hours math 4 core computation (on a 4 core machine), on a system with an idle mongodb, then stop mongodb, see how much more work you can now do
[14:55:52] <GothAlice> I'm not one to mix and match my server services like that. It's an unwise idea normally, doubly unwise with any form of database server. (Where you want the kernel caches primed, etc., etc.)
[14:56:15] <GothAlice> It's triply important with MongoDB, considering the way it uses memory mapped files.
[14:56:27] <odin_> sure if you are trying to squeeze the last 10% of performance (and have budget to do so)
[14:57:38] <odin_> but much of the real world (that I live in) isn't like that, I can see why ticket#10696 complains as his development system a laptop eat battery with mongodb started
[15:01:21] <GothAlice> odin_: As an interesting note, I regularly run 10 mongo processes (replicated sharded setup) for testing read preferences and correct sharding behaviour for data in development. My battery still lasts 8 hours. :/
[15:02:29] <GothAlice> odin_: https://gist.github.com/amcgregor/c33da0d76350f7018875 (note the smallfiles/oplogSize/chunkSize tuning for reduced impact.)
[15:14:15] <shoerain> Can I do something like db.collection.find(...).delete() ? This seems a bit different: http://docs.mongodb.org/manual/reference/command/delete/
[15:15:05] <GothAlice> shoerain: "The remove methods provided by the MongoDB drivers use this command internally." — your driver likely does provide the convenience method on a cursor that you expect.
[15:16:11] <nscavell> where's the best place to ask for a possible feature/enhancement for the mongo java driver (3.x) ?
[15:16:23] <nscavell> https://groups.google.com/forum/#!newtopic/mongodb-user ?
[15:16:33] <GothAlice> nscavell: Likely on the ticket management system for that project, or there.
[15:18:07] <nscavell> GothAlice, thx, jira may indeed be better
[15:29:42] <shoerain> GothAlice: what about in the mongo shell? Do I have to do `db.collection.remove(criteria)? Also is there a builder pattern way to construct queries in mongo? 'mongodb shell builder pattern' doesn't return anything interesting, but it would be nifty to query like so: http://api.mongodb.org/java/2.2/com/mongodb/QueryBuilder.html
[15:31:36] <GothAlice> shoerain: Individual drivers and third-party helper libraries offer different ways of constructing queries. MongoEngine, for example, performs flat-to-nested mapping to allow rich querying with just function arguments (i.e. .find(foo__bar__gt=27)) but also provides a method for combining query elements (Q objects) using overloaded comparators. (I.e. Q(foo__bar__exists=0) | Q(foo__bar=None))
[15:32:28] <GothAlice> shoerain: And yes, in the shell db.collection.remove(query) is the correct form.
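The MongoEngine expression GothAlice quotes compiles, roughly, to a raw filter document — the same shape the shell's `db.collection.remove(query)` accepts. A sketch (field names taken from her example):

```python
# Q(foo__bar__exists=0) | Q(foo__bar=None) becomes, approximately:
query = {
    "$or": [
        {"foo.bar": {"$exists": False}},   # field absent entirely
        {"foo.bar": None},                  # field present but null
    ]
}

# Shell equivalent: db.collection.remove(query)
# pymongo equivalent: db.collection.remove(query)  (pymongo 2.x era)
```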
[15:39:32] <shoerain> Heh, MongoEngine looks a lot like Django's ORM
[15:39:39] <GothAlice> There's a reason for that. :)
[15:40:05] <GothAlice> Except, notably, MongoEngine doesn't go completely batsh*t on simple query generation. ;^)
[15:40:54] <GothAlice> (Our favourite Django-generated query at work printed across fourteen pages at 12pt.)
[15:41:37] <shoerain> evidently, i'm not familiar enough with Django's ORM to know that, but sounds interesting. What's an example? Is it just related to certain queries in SQL being needlessly long?
[15:42:21] <shoerain> I think plucking multiple fields from a JOIN, but I don't remember
[15:42:34] <GothAlice> It includes explicit naming of every field involved in the query, at every level of join. This makes them verbose. It also seems to have an affinity for producing overly convoluted joins. (I.e. pairing data it doesn't need to, adding superfluous joins that don't affect the final results, etc.)
[15:43:51] <GothAlice> shoerain: http://cl.ly/2Q121s2S2d1E/model.pdf — we didn't really help matters, though.
[15:44:56] <shoerain> fun model
[15:45:01] <GothAlice> Yeeeah…
[15:45:05] <GothAlice> Not so much. XD
[15:45:23] <GothAlice> Luckily that got thrown out and I've been rebuilding on MongoDB. It's working a lot better now. ^_^
[15:45:39] <shoerain> is that from graphviz? I definitely don't do much DB stuff, did you build that against an existing table schema?
[15:45:45] <shoerain> s/build/run a program/
[15:47:01] <GothAlice> shoerain: That model had been building for three years prior to my being hired. Had I seen that before I was hired… and yeah, graphviz, with the .dot generated automatically from the models. (I also have this call graph making use of the data: http://cl.ly/011i1W2R0r1P/match-new.pdf )
[15:47:57] <GothAlice> This is an example of how not to code iterative statistical evaluation. ;)
[15:51:41] <GothAlice> This is a problem. :|
[16:32:30] <darkblue_b> hi agend_ - did you get an answer about DNS ?
[16:32:50] <darkblue_b> .. I also wondered about using only host alias'
[16:33:23] <agend_> darkblue_b: yeah - /etc/hosts should work fine
[16:33:51] <darkblue_b> .. thats what my colleague thought also
[16:34:02] <GothAlice> Since MongoDB would be resolving the addresses using gethostbyname(), /etc/hosts will have an opportunity to supply an answer before DNS is invoked. (Thus it'll work fine; coordination of updates to /etc/hosts across a cluster becomes a thing, though.)
[16:34:22] <darkblue_b> ah nice
[16:34:47] <GothAlice> Anyone with 30 seconds willing to give my test Python script there a run to see what happens?
[16:35:09] <agend_> GothAlice: what deps?
[16:35:16] <GothAlice> agend_: pymongo
[16:35:18] <GothAlice> That's it.
[16:35:32] <agend_> GothAlice: so mongo as well
[16:35:46] <GothAlice> agend_: ofc, but nothing fancy needed on that side.
[16:36:13] <agend_> GothAlice: acutally dont have mongo on my desktop - always put them in container
[16:36:34] <agend_> GothAlice: ok - give me script - i'll give it a try
[16:36:49] <GothAlice> agend_: It's in the ticket. https://github.com/mongodb/mongo-python-driver/blob/master/test/test_cursor.py#L117-L123 is the "working" test case.
[16:37:21] <GothAlice> (and why I'm so confused)
[16:38:18] <agend_> GothAlice so I have to clone the repo first?
[16:38:30] <GothAlice> agend_: Nopenopenope. My test script there is self-contained.
[16:38:45] <darkblue_b> so.. this is the part that leads to the situation .. await_data=True).max_time_ms(2 * 1000)
[16:38:48] <GothAlice> if it prints a list with one element, it's failed. If it explodes with a ExecutionTimeout exception, it's passed.
[16:39:04] <GothAlice> darkblue_b: The .max_time_ms bit, aye.
[16:39:09] <agend_> GothAlice: just tell me what to do step after step
[16:39:32] <darkblue_b> [{u'_id': ObjectId('544e74c003f5981eff6e75b7'), u'nop': True}]
[16:39:42] <GothAlice> agend_: Copy and paste the code block from the ticket into a file. In a terminal, make sure you're somewhere pymongo is installed, then just run python path/to/script.py
[16:39:53] <darkblue_b> mongodb 2.6x on Linux..
[16:39:57] <GothAlice> darkblue_b: Yeah, that's a problem. My example matches the test case pretty exactly, and it doesn't behave the same.
[16:40:08] <GothAlice> *mind blown*
[16:40:35] <agend_> GothAlice: u mean the test_cursor.py ?
[16:40:45] <agend_> GothAlice: whole file - right?
[16:40:59] <GothAlice> agend_: No, my case revolves around this particular usage (and test case): https://github.com/mongodb/mongo-python-driver/blob/master/test/test_cursor.py#L117-L123
[16:41:19] <GothAlice> Or the one previous in the file: https://github.com/mongodb/mongo-python-driver/blob/master/test/test_cursor.py#L85-L91
[16:41:55] <darkblue_b> so I take it, you use mongoengine and pymongo both
[16:42:01] <agend_> GothAlice: script with just 117-123 lines would not run
[16:42:13] <darkblue_b> (I grabbed pymongo on install here)
[16:42:42] <agend_> GothAlice: which mongo version do i need?
[16:42:48] <GothAlice> agend_: Indeed, it wouldn't. You'd be missing self.fail and a bunch of other variable references. You'd want to run the test suite and just limit it to executing those.
[16:42:51] <GothAlice> Any 2.6+.
[16:42:51] <agend_> which pymongo?
[16:43:06] <GothAlice> Any pymongo 2.7+
[16:44:54] <agend_> GothAlice: so u want me to download the whole test_cursor.py file - but i can see there are imports from test.test_client etc - so i need to clone the whole repo
[16:45:10] <GothAlice> agend_: No. Not at all. Stop it. ;)
[16:45:37] <GothAlice> pip install pymongo; python path/to/my/sample.py — that should be it, if you've got MongoDB already running locally.
[16:45:42] <agend_> GothAlice: what is: from test.utils import is_mongos, get_command_line, server_started_with_auth
[16:45:48] <GothAlice> agend_: No.
[16:46:02] <agend_> GothAlice: what is sample.py?
[16:46:04] <GothAlice> agend_: https://jira.mongodb.org/browse/PYTHON-780
[16:46:09] <GothAlice> ^ That contains sample.py
[16:46:28] <GothAlice> (Note the block of code. That reduces the test case to the bare minimum without other dependencies.)
[16:46:39] <agend_> GothAlice: now u talking ;)
[16:47:35] <GothAlice> darkblue_b: And yes, I use both pretty heavily. High-efficiency stuff I do in raw pymongo. Full disclosure: I also contribute to MongoEngine development. (Notably .scalar() and a substantial update to signal support and documentation.)
[16:50:59] <cheeser> so you work with jesse a lot then.
[16:51:30] <GothAlice> XD
[16:52:06] <GothAlice> Not sure how to phrase this: less than I'd like to, but I like not having to work with him even more, if you catch my drift. It's rare that I encounter a problem I can't explain or work around. ^_^;
[16:52:33] <cheeser> being able to scratch your own itches++
[17:08:55] <agend_> GothAlice: [{u'_id': ObjectId('544e7bf86c1288006fccdc25'), u'nop': True}]
[17:09:07] <GothAlice> agend_: Yup. Thank you for confirming. :)
[17:09:10] <agend_> GothAlice: are we happy?
[17:09:19] <GothAlice> agend_: In a perfect world that should have exploded gloriously instead of returning.
[17:10:44] <agend_> GothAlice: dont know too much about capped collections, never used them
[17:11:54] <GothAlice> For example, every single incoming request gets recorded into one for diagnostics and replay in development.
[17:12:21] <GothAlice> (Along with the matching response.)
[17:12:33] <agend_> GothAlice: r u a girl?
[17:12:47] <GothAlice> agend_: This is the internet. I'm an AI. ;)
[17:13:02] <agend_> GothAlice: i new u r a girl :)
[17:13:32] <cheeser> please knock that off.
[17:47:00] <agend_> GothAlice: so it was all about checking if this world is perfect?
[17:53:07] <GothAlice> Technically it was confirming that await_data causes mongod to ignore max_time_ms.
[18:28:11] <cheeser> i think await_data implies you don't care how long it takes...
[18:28:16] <cheeser> that's the point of a tailable cursor
[18:31:04] <GothAlice> I'd expect await_data to be restricted by maxTimeMS.
[18:31:11] <agend_> GothAlice: just kidding
[18:31:48] <GothAlice> https://jira.mongodb.org/browse/SERVER-2212 doesn't mention anything specific about them at all. ;)
[19:26:36] <easytyping> Can anybody recommend a NoSQL database for a restaurant ordering app that has Users, Orders, Reviews?
[19:26:41] <easytyping> I feel like I would be comfortable implementing this in a traditional RDBMS, but I'm curious to see if there's a similar NoSQL alternative; I want to jump on the bandwagon.
[19:28:01] <cheeser> you should use mongodb
[19:29:16] <easytyping> i was thinking of either mongodb or redis, but i didn't know enough about their best use cases
[19:29:32] <easytyping> i'm not familiar with denormalizing my data and thinking in the "nosql" way if there is one
[19:29:41] <easytyping> i've read mongodb is good for documents
[19:29:59] <easytyping> could that fit a simple restaurant app with the simple relations i've described?
[19:30:55] <cheeser> yes
[19:37:23] <easytyping> any simple reason why i shouldn't use redis or another nosql implementation?
[19:50:02] <daidoji> easytyping: Mongo is easier to use mostly
[19:50:11] <daidoji> easytyping: way easier to administrate as well
[19:50:59] <daidoji> easytyping: but there are numerous pros/cons for all major NoSQL databases
[19:51:08] <lpghatguy> I found myself emulating MongoDB in a traditional RDBMS environment so I decided to just move to MongoDB instead
[19:52:03] <GothAlice> Any time you have relational data and think to use single-table inheritance, multi-table inheritance, split-table inheritance, or entity-attribute-value storage… MongoDB will probably solve your problem more easily and efficiently. ;)
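For easytyping's restaurant app, the "nosql way" is mostly denormalization: an order embeds its line items rather than joining an order_items table. A sketch of one such document, with every field name invented:

```python
# One order as a single document -- no join needed to render a receipt.
order = {
    "user_id": "u123",
    "restaurant": "Chez Example",
    "items": [
        {"name": "soup",  "price_cents": 450, "qty": 1},
        {"name": "bread", "price_cents": 200, "qty": 2},
    ],
    "total_cents": 850,   # denormalized: kept consistent with items at write time
}

computed_total = sum(i["price_cents"] * i["qty"] for i in order["items"])
```

Reviews would be a separate collection referencing `user_id` and the restaurant, since they grow without bound and are queried independently.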
[19:57:12] <easytyping> thanks guys
[19:59:17] <GothAlice> I've got one client who stores PHP serialized blobs in a relational database, with some really funky columns to extract the deeply nested information he wants to search on. I nearly had a fit when I saw him doing that. I still nearly have a fit just *thinking* about it.
[20:06:10] <drags> according to the docs it would appear that anytime a node is removed from a replica set that an election will take place to ensure the primary is correct
[20:06:13] <drags> (http://docs.mongodb.org/v2.4/tutorial/remove-replica-set-member/)
[20:06:26] <drags> does this happen even when removing a hidden&non-voting secondary?
[20:06:47] <drags> what kind of interruption to service would I expect from this election cycle?
[20:07:22] <GothAlice> drags: Under normal operation the election cycle is over almost instantly as the previous primary announces its opLogTime as being ahead of everyone else's, and thus the only viable choice.
[20:07:59] <drags> GothAlice: also I should mention that my primary has a higher priority
[20:08:28] <drags> should I expect any of my client connections to drop? working with a finicky middleware that has had trouble with nodes dropping out of the cluster in the past
[20:10:18] <GothAlice> drags: Elections only consider alive hosts with the highest available priority, with choices sorted descending on opLogTime. (I.e. nodes w/ priority 10 will always win over nodes w/ priority 9.) Yes, you have to expect and handle disconnects; most client drivers should automatically re-connect.
[20:10:51] <GothAlice> drags: (When there's an election client connections are closed automatically. It would require testing to identify if this is also true even if the primary does not change hosts.)
[20:17:29] <drags> GothAlice: thanks
[20:20:54] <Guest65133> is there any documentation relating to starting a mongodb config server automatically on rhel 7 using systemctl enable mongodb.service
[20:21:28] <Guest65133> i'm in the process of deploying a sharded cluster however i want all this stuff to start automatically when the system starts and run as user mongod, not as root
[20:41:04] <blizzow> subject line for the chat room should reflect 2.4.12 is out now...
[20:53:23] <joaovagner> guys, is it possible to use the where_gt method (Mongo_DB class) with the MongoDate type?
[20:53:41] <joaovagner> sorry, channel wrong =(
[20:59:17] <maxamillion> so my nodes in my replica set ran out of space, I increased the storage and it started spewing errors in the logs, I tried to repair and I got this http://fpaste.org/145626/41444345/ ... anyone have any suggestions?
[21:00:59] <GothAlice> maxamillion: Ensure you have free space equal to 2.1GB + the size of your existing dataset, and run mongod with --repair.
[21:01:13] <GothAlice> If that fails the next step is to hope your backups are recent.
[21:02:07] <maxamillion> GothAlice: I have backups from this morning, was just hoping there might be a way to recover without going to them .... should I stop mongod on all nodes in the replica set and then run mongod --repair on the primary?
[21:02:36] <GothAlice> Guest65133: I believe the best place to ask about distribution-specific setup would be the distribution support channel. I can automate Gentoo, not RHEL.
[21:04:34] <GothAlice> maxamillion: I have concerns over the fact that the broken machine still thinks it's primary, but yes. In principle stopping the cluster, repairing the primary, and starting back up may work.
[21:04:51] <maxamillion> GothAlice: thanks
[21:05:12] <GothAlice> maxamillion: Note that you are in the "danger zone" right now—your replica set *might* start up and pick a different primary, thus carefully avoiding the repair. ;)
[21:05:13] <maxamillion> GothAlice: well the errors were first seen on the secondaries trying to sync to the primary, then I tried the repair on the primary and it didn't go well
[21:05:26] <maxamillion> GothAlice: ah, nice
[21:05:27] <maxamillion> :/
[21:07:48] <GothAlice> maxamillion: After mongod --repair I'd recommend an immediate mongodump of the data recovered, prior to re-election, if possible. (Since if a secondary does get promoted, the original primary will re-sync to the newly elected.)
[21:08:06] <maxamillion> GothAlice: will do, much appreciated
[21:20:26] <pcoramasionwu> a
[21:47:11] <chovy> GothAlice: thanks
[21:48:00] <chovy> GothAlice: so adding a global admin will restrict all dbs to only be accessible by that admin user or any new users he creates?
[21:48:51] <GothAlice> chovy: After the addition of the first user authentication becomes "really enabled". Thus it's rather important that that first user be an admin.
[21:49:25] <GothAlice> But yes; from that point onwards a user must be authenticated to do anything with any database or collection.
[21:49:25] <chovy> GothAlice: ok
[21:49:44] <chovy> are there any good admin tools for mongo?
[21:49:57] <GothAlice> chovy: http://mms.mongodb.com ;)
[22:34:07] <mgeorge> so selinux in centos 7 is blocking mongod from writing to /data/configdb even with mongod:mongod as 775
[22:34:10] <mgeorge> disable selinux and it works fine
[22:34:27] <mgeorge> is there a specific command i can execute to enable mongod access to /data/
[22:42:03] <chovy> GothAlice: i mean a gui i can install on my laptop to manage a mongo db. right now i'm using robomongo (mac)
[22:43:54] <GothAlice> chovy: Robomongo is… not the most polished. Unfortunately I do not use frontends to admin my DBs; pymongo and the mongo shell for me.
[22:44:39] <GothAlice> mgeorge: You may have better luck askin in #linux about modifying selinux rulesets.
[23:07:04] <GothAlice> chovy: Also a huge amount of shell scripting. A la: https://twitter.com/GothAlice/status/421887277476749313 (identify files changed by commit, then packages owning those files, then all files owned by *those* packages, then filter for system initialization scripts as a prelude for calling "reload" on the ones with changed files.)
[23:07:57] <GothAlice> That, right there, is the fun part that lets me push /etc/mongod.conf changes and have mongod automatically pick the change up in the cluster. :)