#mongodb logs for Wednesday the 10th of December, 2014

[00:00:26] <Boomtime> modulus^: this is not the channel you are looking for
[00:00:39] <Boomtime> do you have a mongodb question?
[00:02:50] <modulus^> Boomtime: i've got some slow queries on my mongo servers
[00:02:57] <modulus^> Boomtime: queries that take 30k ms
[00:03:42] <modulus^> Boomtime: i have only one 8 terabyte large collection
[00:16:15] <joannac> modulus^: what does the explain say?
[00:16:37] <joannac> modulus^: also, what do the logs say in terms of nscanned, nscannedObjects, index use, time in read lock, etc?
[00:18:52] <Boomtime> modulus^: run a query with .explain() and pastebin/gist the output
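A minimal sketch of that diagnostic in the mongo shell; the collection and filter below are illustrative, modelled on the query being discussed:

    // run the slow query with .explain() to see nscanned, index use, etc.
    db.mycollection.find({ filename: "greet_example.wav" })
                   .sort({ uploadDate: -1 })
                   .explain()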
[00:20:10] <bpap> will Mongo 2.8 allow you to choose storage engine on a per-collection basis? or is it one engine for the entire server?
[00:22:12] <joannac> bpap: per mongod instance
[00:22:32] <bpap> got it, thanks.
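For reference, a sketch of how the engine would be chosen at startup in the 2.8/3.0 line; the flag applies to the whole mongod process, and the dbpath here is illustrative:

    # one storage engine per mongod, selected at startup
    mongod --storageEngine wiredTiger --dbpath /data/db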
[00:23:24] <modulus^> i just have logfile entries like this:
[00:23:27] <modulus^> [conn212072] query VM.rec.files query: { query: { filename: "greet_522fb1d7e4b0d91b533cbef2.wav" }, orderby: { uploadDate: -1 } } ntoreturn:1 ntoskip:0 nscanned:1 keyUpdates:0 locks(micros) r:1420 nreturned:1 reslen:180 28571ms
[00:23:31] <modulus^> 28.5k ms
[00:24:02] <modulus^> that's a lot of milliseconds
[00:24:13] <joannac> that's queued behind something else
[00:24:22] <joannac> it spends 1420 microseconds in the readlock
[00:24:31] <joannac> it's waiting on something else
[00:24:42] <joannac> find the something else, that's the actual problem
[00:25:30] <modulus^> [conn210074] insert VM.rec.chunks ninserted:1 keyUpdates:0 locks(micros) w:28428610 28427ms
[00:25:34] <modulus^> is that the problem?
[00:25:50] <modulus^> no readlock output
[00:25:53] <joannac> yes
[00:26:10] <joannac> slow disks?
[00:26:54] <modulus^> 12 drives raid 1+0
[00:27:40] <modulus^> each drive is 4TB size
[00:29:22] <Boomtime> run mongostat, check lock%
[00:29:36] <aliasc> Hi people :)
[00:29:37] <Boomtime> it certainly looks like you are write-bound
[00:30:58] <modulus^> spiking to 40% locked for collection
[00:31:57] <modulus^> iostat shows a lot more blk_reads than blk_writes
[00:33:22] <modulus^> few qr up to 15
[00:33:25] <modulus^> zero qw
[00:35:16] <modulus^> looks like writes are blocking reads
[00:35:18] <modulus^> GAH!
[00:35:46] <modulus^> will document level locking increase performance in my case?
[00:42:15] <Boomtime> modulus^: it's hard to say, you have a 28 second single write - it doesn't matter how parallel the database is if it can't write to disk
[00:43:13] <Boomtime> you say you're only getting 40% lock, but that just means that we haven't observed the problem - a 28 second write that hits only a single document is ridiculous
[00:45:15] <modulus^> the higher the queued reads, the higher the active writes count
[00:45:48] <modulus^> maybe it's better to have more, smaller drives in the raid 1+0
[00:47:16] <Boomtime> modulus^: capture the output from mongostat for a minute or so, and pastebin/gist it
[00:47:28] <Boomtime> needs to be at least a minute actually
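One way to do that capture, assuming a local mongod on the default port:

    # sample once per second for ~60 rows and save the output for pastebin/gist
    mongostat --rowcount 60 1 > mongostat.txt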
[00:49:35] <modulus^> pastie doesn't allow txt file upload
[00:53:13] <modulus^> Boomtime: http://pastebin.ca/2884376
[00:53:16] <modulus^> look at that
[00:53:33] <modulus^> 142 seconds of mongostat
[00:54:10] <Boomtime> cool
[00:54:17] <Boomtime> ok, so you are running basically from disk
[00:54:25] <Boomtime> look at the faults column
[00:54:53] <Boomtime> the number of faults is nearly equal to the number of ops, every operation page faults to disk
[00:55:14] <modulus^> yeah that's the nature of the data
[00:55:29] <modulus^> cache/memstore is not really an option that i'm capable of
[00:56:01] <Boomtime> your write throughput is shockingly bad, it takes just 25 writes/s to hit 60% lock
[00:56:21] <modulus^> Boomtime: i hear ya
[00:56:27] <Boomtime> what are the disks?
[00:56:33] <Boomtime> like, manufacturer, speed, etc
[00:57:49] <modulus^> Interface Type: SATA
[00:57:49] <modulus^> Size: 4000.7 GB
[00:58:05] <modulus^> PHY Transfer Rate: 6.0GBPS
[00:58:15] <Boomtime> spinning disks then.. what spindle speed?
[00:59:30] <modulus^> Spindle Speed, 7200RPM
[00:59:50] <modulus^> manufacturer: HP
[01:01:25] <Boomtime> i'm a little surprised it is as bad as we're seeing.. i suspect it all comes down to seek time
[01:01:53] <Boomtime> the trouble with seek time is that it doesn't improve no matter how many disks you add
[01:02:19] <Boomtime> (well, unless you add so many disks that their individual onboard caches collectively store everything)
[01:03:10] <modulus^> i guess 7.2k is crapping out on us
[01:03:56] <Boomtime> it may also be partly the controller, a RAID controller can, at best, make the seek time for read/write unchanged - or, it can make it worse
[01:04:20] <modulus^> let me see what raid controller...
[01:23:04] <modulus^> Boomtime: would 14k spindle drives improve IO a lot?
[01:24:06] <Boomtime> maybe..?
[01:24:16] <Boomtime> i think you should do some testing
[01:24:47] <Boomtime> outside of mongodb, use a disk testing tool to thrash the disk around a bit and observe which metric is letting you down
[01:25:30] <modulus^> what do you mean thrash?
[01:26:21] <Boomtime> a disk IO testing tool - one that will read/write some random data, in varying sized blocks, in varying places, and give a performance report
[01:26:41] <modulus^> http://www.deploymentresearch.com/Research/tabid/62/EntryId/135/For-the-fun-of-it-Benchmarking-the-HP-Smart-Array-420i-controller.aspx
[01:26:46] <modulus^> someone already did that
[01:27:38] <Boomtime> someone broke in and ran some disk tests on your machine for you?
[01:27:44] <Boomtime> that was kind of them
[01:28:22] <modulus^> same raid controller there
[01:28:33] <Boomtime> that means little unfortunately
[01:29:55] <Boomtime> raid controllers are highly configurable (does yours even have the same bios revision as his?) and that is without considering the disks, the OS, other software or motherboard that backs it all
[01:30:14] <Boomtime> unless you have done performance tests yourself, on your own hardware, you have no idea what your performance actually is
[01:31:20] <Boomtime> "IOs/sec:  2889.47
[01:31:20] <Boomtime> MBs/sec:   180.59"
[01:31:42] <Boomtime> 64KB blocks on RAID1.. spinning disks
[01:31:53] <Boomtime> you aren't even within an order of magnitude of this result
[01:32:20] <Boomtime> something is different/wrong, you need to test it yourself
[01:33:22] <modulus^> # cat /sys/block/sdb/queue/scheduler
[01:33:22] <modulus^> noop anticipatory [deadline] cfq
[01:33:31] <modulus^> is deadline ok scheduler for mongodb?
[01:33:50] <modulus^> the secondary mongos are using cfq
[01:35:35] <Boomtime> don't know, sorry
[01:44:42] <HMill> Hello, all. Is there a way to get mongo to return a cursor from collection.aggregate() in v2.4.X?
[01:45:46] <joannac> HMill: no
[01:46:37] <joannac> upgrade to 2.6, or make sure your aggregation output doesn't exceed 16MB (and intermediate stages don't exceed 32MB)
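For reference, the 2.6 shell form that returns a cursor rather than a single 16MB-bounded result document looks roughly like this; the pipeline and batch size are illustrative:

    // 2.6+: pass a cursor option so results stream back incrementally
    db.mycollection.aggregate(
        [ { $match: { status: "active" } }, { $group: { _id: "$type", n: { $sum: 1 } } } ],
        { cursor: { batchSize: 100 } }
    )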
[01:46:47] <HMill> joannac: OK, I'm sol :)
[01:47:02] <HMill> joannac: If I could upgrade easily, I already would have. :)
[01:47:14] <modulus^> rolling upgrades are where it's at
[01:47:18] <HMill> joannac: Thanks for responding.
[01:49:10] <joannac> HMill: why can't you upgrade, out of curiosity?
[01:49:57] <HMill> joannac: It's for a meteor app. Right now, stuck at 2.4.x.
[01:50:16] <HMill> I can use a newer version of mongodb, but it can be a real pita.
[01:50:33] <modulus^> HMill: you need a devops engineer
[01:50:43] <HMill> modulus^: haha, sure, sure
[01:50:47] <modulus^> they can create rpm with custom init script to make that easy for you
[01:51:07] <HMill> I can devop!
[01:51:13] <HMill> I'm kidding. it'd be nice
[01:51:45] <modulus^> should take 5 minutes to upgrade a mongo instance
[01:51:51] <joannac> Based on https://github.com/meteor/meteor/issues/2036, they're not really supporting 2.6 yet
[01:51:58] <joannac> so yeah, i think you're stuck
[01:52:00] <modulus^> 2 hours to compile/rpm package the new version
[01:52:08] <HMill> joannac: right :\
[01:54:47] <HMill> I'm assuming no one in here is using Meteor?
[01:55:00] <HMill> modulus^: Are you devops ninja?
[01:55:17] <modulus^> HMill: no just a regular ninja
[01:55:51] <HMill> modulus^: that's cool. i'm down with regular ninjas
[01:58:39] <modulus^> real ninjas look like harmless fuzzy bunny rabbits
[01:58:50] <modulus^> until chuck norris pisses one off
[03:57:45] <kataracha> if querying on _id should there be any difference between the time it would take to query a very large collection compared to a small one or are they both constant lookup regardless?
[03:58:37] <cheeser> not constant, no, but fast
[04:05:02] <kataracha> but equally fast?
[04:09:21] <GothAlice> No.
[04:10:28] <GothAlice> kataracha: Indexes are generally stored as b-trees. This gives them predictable performance, ref: http://stackoverflow.com/questions/4694574/database-indexes-and-their-big-o-notation
[04:19:20] <bmillham> Hi all. I have some questions about the best way to keep a remote MongoDB in sync with a local copy.
[04:19:40] <bmillham> Some background, I currently have a MySQL database (local and remote)
[04:20:10] <bmillham> New records are added locally, and I run a script to update the remote.
[04:20:55] <Boomtime> bmillham: replica-set
[04:20:58] <bmillham> It's a site for listeners to my internet radio show to make requests on.
[04:21:39] <Boomtime> be sure to make the priority zero for the remote secondary so it is never a valid option for primary
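A sketch of that reconfiguration from the primary's shell, assuming the remote node is members[1] in the replica set config:

    // make the remote secondary ineligible to ever become primary
    var cfg = rs.conf()
    cfg.members[1].priority = 0
    rs.reconfig(cfg)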
[04:21:43] <bmillham> So when I'm DJing, they make requests at the site, which I'd like to have updated locally
[04:21:51] <GothAlice> bmillham: At work we use a pair of replica secondaries, one in the office, one in my apartment. The one in my apartment is purely for backup purposes (it's actually delayed 24 hours to assist with data recovery in the event of deletion) and the one in the office gets queried.
[04:22:20] <bmillham> And my local system is NOT connectible from the outside world
[04:23:20] <bmillham> And hi Boomtime
[04:23:41] <GothAlice> bmillham: You can do interesting things like run the secondary's connections through an SSH tunnel. :)
[04:23:49] <GothAlice> (Or real VPN, for more reliable service.)
[04:24:29] <bmillham> I was thinking of an SSH tunnel, but SSH is painfully slow on HughesNet
[04:26:51] <GothAlice> Well, you'll want an encrypted connection.
[04:26:58] <GothAlice> Ain't nobody got time for passive sniffing attacks.
[04:27:18] <bmillham> Can a replica-set be setup so that my local system periodically queries the remote for updates?
[04:27:26] <GothAlice> Yes.
[04:28:00] <GothAlice> bmillham: Just make sure your "oplog" size is sufficient to cover the period of time between "syncs" (plus a fair bit of head room for safety), and your in-office secondary can be spun up when you need it.
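That headroom can be sanity-checked from the shell; these helpers report the configured oplog size, the window of time it currently covers, and how far behind each member is:

    rs.printReplicationInfo()       // on the primary: oplog size and time span
    rs.printSlaveReplicationInfo()  // replication lag of each secondary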
[04:28:13] <bmillham> With a SSH tunnel, I can't open any ports. That's just how HughesNet works.
[04:28:24] <GothAlice> Then SSH can't be used in your case.
[04:28:36] <GothAlice> One might suggest finding a better host, since… that's kinda basic.
[04:28:59] <bmillham> Oops, I misstated that. An SSH tunnel would work. I meant without a tunnel.
[04:29:09] <bmillham> And the only other host here is dialup
[04:32:05] <bmillham> As you may have guessed, I live out in the country. But it's sad that I live 60 miles from Washington DC, and there is no high speed internet available here other than satellite.
[04:34:30] <bmillham> Unless I want to pay an insane amount for a DS1/3
[04:34:58] <GothAlice> Ah, not quite what I meant.
[04:35:04] <bmillham> (And I'm not sure if even those are available from the local central office)
[04:41:28] <bmillham> OK, looking at replica-sets, it looks like that won't work for what I'm hoping for. The remote replica can't accept updates. And I need that.
[04:43:44] <bmillham> (Updates from the app running on the server that the replica is located on)
[06:18:53] <kenalex> hello guys
[06:22:59] <Boomtime> hi there!
[06:27:33] <kenalex> i am new to mongodb and currently testing it out. When is mongodb preferred over an RDBMS?
[07:04:53] <sabrehagen> other than semantic meaning, is there value to storing an id field of a foreign object as ObjectID type rather than string?
[07:17:37] <logic> morning..
[07:18:03] <logic> How do I update my Schema with a new path in mongo shell ?
[07:21:26] <logic> in mongoose, I could use new Schema.add({path: 'type'}) but I would like to do it from mongo shell
[07:21:28] <linocisco> hi all
[07:21:44] <linocisco> what programming language is best to work with mongodb?
[07:21:44] <borjagvo> Hi. It seems I found a bug: http://stackoverflow.com/questions/27381041/text-search-not-working?noredirect=1#comment43219076_27381041. The $search doesn't work with the word "mesías", for example. Neither of the words is a stop word.
[07:22:26] <borjagvo> Interestingly, I didn't see either of these two words in the spanish stop words list: https://github.com/mongodb/mongo/blob/master/src/mongo/db/fts/stop_words_spanish.txt
[07:24:46] <logic> javascript linocisco
[07:25:22] <linocisco> logic, is it a complete programming language? Node.js is one form of javascript. It is confusing
[07:25:54] <logic> borjagvo, create a multi-language text index
[07:26:36] <borjagvo> logic: I thought I just did that: http://stackoverflow.com/questions/27381041/text-search-not-working?noredirect=1#comment43219076_27381041
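For comparison, a minimal text index with an explicit language, or with a per-document language override, looks roughly like this; the collection and field names are illustrative:

    // single-language index using the Spanish stemmer and stop words
    db.articles.ensureIndex({ body: "text" }, { default_language: "spanish" })
    // multi-language: each document carries its own language in a "language" field
    db.articles.ensureIndex({ body: "text" }, { language_override: "language" })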
[07:27:53] <logic> linocisco, learning more about fullstack will make it clearer to you.
[07:28:30] <linocisco> logic, fullstack is the name of a programming language? i have never heard of it
[08:13:24] <kakashi__> hi
[08:15:00] <kakashi__> when we apply aggregation... does mongos collect all the data from the mongods?
[08:20:02] <Climax777> Hi all. Now that wiredtiger compression is to be available, how big an effect do long field names have now?
[09:06:24] <Streemo> will $or:[{field:10},{field:10}] match anything with {field:10}? am I allowed to fall back on redundancy like that?
[09:08:01] <Streemo> yes i am
[09:08:03] <Streemo> allowed
[09:08:07] <Streemo> i tested it
[09:43:08] <linocisco> mongoDB is NOT Mongolia DB.
[09:44:11] <linocisco> is mongoDB used for Banking transactions?
[09:51:46] <kali> linocisco: i would not recommend it
[09:52:17] <Derick> banks do use it though
[09:56:18] <Industrial> Hi. How do I rename a database?
[09:57:00] <Industrial> oh, https://jira.mongodb.org/browse/SERVER-701
[09:58:30] <linocisco> kali, why? not stable or not safe?
[10:03:06] <kali> linocisco: mongodb atomicity is limited to one single document update. so typical banking transaction scenario get difficult to implement correctly
[10:15:21] <linocisco> kali, what is one single document update? is it only ok for one transaction at a time?
[10:28:37] <kali> linocisco: how deep is your understanding of mongodb so far ?
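To make kali's point concrete: a transfer touches two account documents, and while each update below is atomic on its own, nothing ties the pair together, so a failure between them leaves the books inconsistent (collection and field names are made up):

    // each statement is atomic individually, but not as a pair
    db.accounts.update({ _id: "alice" }, { $inc: { balance: -100 } })
    // a crash here debits alice without ever crediting bob
    db.accounts.update({ _id: "bob" },   { $inc: { balance: 100 } })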
[11:47:03] <alexi5> hello
[12:41:07] <borjagvo> Hi. It seems I found a bug: http://stackoverflow.com/questions/27381041/text-search-not-working?noredirect=1#comment43219076_27381041. The $search doesn't work with the word "mesías", for example. Neither of the words is a stop word.
[12:41:23] <borjagvo> Interestingly, I didn't see either of these two words in the spanish stop words list: https://github.com/mongodb/mongo/blob/master/src/mongo/db/fts/stop_words_spanish.txt
[12:57:35] <borjagvo> Any help please from any person involved in the project? I posted some more details on the comments: http://stackoverflow.com/questions/27381041/text-search-not-working?noredirect=1#comment43219076_27381041
[13:36:58] <rioch> I have a document like so: {'field1' : 'value', 'field2': [ {'key': 'a', 'name': 'a'}, {'key': 'b', 'name': 'b'}, {'key': 'c', 'name': 'c'}]}. How can I update it so that all key fields are set to null?
[15:10:43] <Constg> Good afternoon, I have a question: How would you update ($increase) a value in an object which is nested in a array?
[15:13:12] <Constg> haaaa $ operator! I forgot this one :(
[15:13:16] <Constg> Thank you, me.
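The pattern Constg is recalling, roughly: match the array element in the query and let the positional $ point the $inc at it (collection and field names are illustrative):

    // increment "count" inside the array element whose key is "a"
    db.things.update(
        { _id: 1, "items.key": "a" },
        { $inc: { "items.$.count": 1 } }
    )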
[16:18:26] <aliasc> Hi,
[17:19:51] <harttho> Anyone have any tips for finding slow queries?
[17:20:11] <harttho> Specifically, when some slow queries run, the performance of unrelated ones is affected too
[17:20:22] <harttho> So the logs/profiling show them all as slow
[17:26:15] <Constg> harttho, db.currentOp() will list queries
[17:26:29] <Constg> And there's a param that shows how long they've been running
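A rough way to pull just the long runners out of that output; the 5-second threshold is arbitrary:

    // print operations that have been running for more than 5 seconds
    db.currentOp().inprog.forEach(function (op) {
        if (op.secs_running && op.secs_running > 5) printjson(op)
    })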
[17:36:11] <rchickenman> I am running a very simple query that returns 20 or so simple documents with no filters, and it's taking 30 seconds in Node.JS, but only a few milliseconds when I do it from the Mongodb shell.
[17:36:28] <rchickenman> Can anyone please help explain this behavior?
[17:41:39] <GothAlice> rchickenman: Is that MongoDB shell running on the same host as the server itself? Also, what are you doing with that data, and is it overly large? (I.e. is each document multi-megabyte?)
[17:41:54] <GothAlice> Network transfer is a likely culprit.
[17:42:23] <rchickenman> No, each document has roughly five simple string fields.
[17:42:32] <rchickenman> And the mongodb shell is running on the same server as the node.js code.
[17:42:53] <aliasc> Hello
[17:43:05] <aliasc> anyone here
[17:43:15] <GothAlice> rchickenman: Could you gist the relevant query handling code from both the JS and shell?
[17:43:29] <GothAlice> aliasc: Nope. It's turtles all the way down. :)
[17:44:28] <rchickenman> Also I should mention I'm using Mongoose.
[17:45:42] <rchickenman> Story.find().lean(true).select('url').exec(function(error, stories) { console.log(stories) })
[17:45:48] <rchickenman> Very simple query.
[17:46:56] <rchickenman> I tried using the "lean" feature because my understanding is that it removes whatever overhead mongoose usually imposes.
[17:50:45] <GothAlice> rchickenman: Hmm. If it's working from the shell, but freaking out in Mongoose, sure sounds like a Mongoose issue. Have you asked in #mongoosejs? (They'd be more likely to be able to assist… I don't JS, so my help here is limited.)
[17:56:22] <rchickenman> Okay, before I go there I might take some time to try it with the vanilla node.js mongodb-native driver to isolate it as a node or a mongoose issue.
[17:56:25] <rchickenman> Thanks for your help!
[18:09:45] <GothAlice> Welp, I now have a complete caching decorator library (both general memoize and Document-aware method decorator) for Python built on MongoDB ready for extraction from my work codebase. Mondays suck, but Tuesdays are very productive. :3
[19:58:47] <GothAlice> Does anyone know if MMS can be told to not stream-backup specific collections? I'd like to not have my cache (which can be rebuilt in its entirety or lazily) pointlessly transferred around. :/
[19:58:53] <GothAlice> joannac: ^
[20:00:11] <cheeser> GothAlice: asking the backup team now
[20:00:19] <GothAlice> cheeser: Great, thanks! :D
[20:01:25] <cheeser> from one of the devs: on the backup page in the gear box of options on the right there should be a "Manage excluded namespaces". adding to that should take effect shortly
[20:02:52] <cheeser> it's still going to read changes to those collections from the oplog collection, but the agent will stop forwarding them onward
[20:33:06] <mike_edmr> hey
[20:33:17] <mike_edmr> are unordered writes faster than ordered?
[20:36:10] <michaelq> Quick Mongoose.js question: User.count({}) is returning [Object, object]. How do I get it to instead return the results of the query?
[20:39:28] <ejb> Can anyone recommend some mongo based job queues? I'd like to queue jobs from meteor and process them on some other server(s)
[20:39:58] <Synt4x`> is there an easy way from mongo shell to output my .find (about 2,500 results) into a CSV or something that's easier to scroll through and look at?
[20:40:09] <mike_edmr> ive used monq but its not exactly.. top o the line
[20:40:18] <cheeser> mongodump with a query
[20:40:32] <mike_edmr> there is a bit of cruft in the "schema"
[20:41:12] <Synt4x`> cheeser: thanks I'll look into it, I thought mongodump was to save a whole DB
[20:42:42] <ejb> mike_edmr: can monq be used across servers?
[20:43:11] <mike_edmr> sure. it marks a job as in-progress when you pop it off the queue.
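The claim step mike_edmr describes is typically a single findAndModify, so two workers on different servers can't grab the same job; a rough sketch (not monq's actual schema, field names are made up):

    // atomically claim the oldest queued job and mark it in-progress
    db.jobs.findAndModify({
        query:  { status: "queued" },
        sort:   { _id: 1 },
        update: { $set: { status: "working", startedAt: new Date() } },
        new:    true
    })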
[21:05:50] <GothAlice> ejb: Task queues in MongoDB are easy to roll for yourself: https://gist.github.com/amcgregor/4207375 is an extract from a presentation I gave on the process, with full Python implementation (supporting immediate and scheduled tasks) linked in the comments.
[21:07:43] <joannac> Synt4x`: you probably actually want mongoexport
[21:09:18] <cheeser> yeah. i think dump dumps to bson. export will do json/csv/tsv
[21:09:42] <joannac> just don't try and insert it again, it might not preserve types
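A rough mongoexport invocation for that; db, collection, query and field names are placeholders, and on the 2.x-era tools the CSV switch is --csv:

    # export matching documents as CSV (--fields is required for CSV output)
    mongoexport --db mydb --collection mycoll \
        --query '{ "status": "open" }' \
        --csv --fields name,url,createdAt \
        --out results.csv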
[21:10:29] <ejb> GothAlice: Excellent. Thanks
[21:14:35] <d-snp> hi, we have a high-write low-read environment, and I'm thinking of splitting our databases up to a database per customer, will this mess with the write throughput?
[21:14:52] <GothAlice> d-snp: It'll improve it, generally.
[21:15:17] <d-snp> yeah? what are the reasons for it improving? is it because of the smaller indexes?
[21:15:22] <GothAlice> d-snp: MongoDB currently has a database-level write lock. (Future versions will have document-level write locks.) Splitting into separate DBs = spreading the locking around.
[21:15:53] <GothAlice> (No contention when simultaneously updating records for different clients in your case.)
[21:16:14] <d-snp> right, that sounds like it makes sense
[21:17:06] <GothAlice> d-snp: Smaller indexes will help update/find performance, too. Even lookups by ID should be faster, though depending on data size the difference might not be really measurable.
[21:17:13] <d-snp> I'm actually not sure if locking is an issue, it's not really important that the writes go through immediately, as long as the final throughput is optimal
[21:17:28] <GothAlice> Define: "final throughput"
[21:18:20] <d-snp> well, I was fearing that because the writes wouldn't be to the same file anymore, there would be fewer sequential writes, which might reduce performance
[21:18:49] <d-snp> is it normal that locking is a bigger bottleneck than the ssd?
[21:20:06] <d-snp> I'm not a db expert, so maybe I'm not making sense :P
[21:21:33] <GothAlice> Locking is an issue if you have high "waiting for write lock" times or write lock percentages. Read-intensive databases have low percentages here, write-intensive databases (where a write is likely to happen before a previous one finished) will have a high percentage. (Sometimes >50%.)
[21:22:21] <GothAlice> When you're spending so much time literally doing nothing and waiting, splitting across DBs can eliminate much of the waiting, and thus utilize disk IO more efficiently.
[21:22:34] <d-snp> ok
[21:22:53] <d-snp> are random writes to an SSD a non-issue?
[21:22:59] <GothAlice> (SSDs are truly random access devices compared to spindle drives with a physical head that needs to move.)
[21:23:04] <d-snp> cool
[21:23:27] <d-snp> well, that certainly strengthens my case, I'll be able to defend my move to the CTO now I think :D thanks :P
[21:23:37] <GothAlice> It never hurts to help, d-snp. :)
[21:25:51] <d-snp> I hope rearchitecting our stuff to support a db per customer isn't going to be too hard
[21:27:47] <d-snp> do you know any companies who work like this? maybe I should build a middleware that makes it sort of transparent
[21:28:02] <GothAlice> d-snp: We're actually preparing to do just this at work.
[21:28:40] <GothAlice> (Data colocation laws in Canada require single-tenant databases for some of our clients.)
[21:29:16] <d-snp> heh nice
[21:29:35] <GothAlice> In our case, choice of database is derived from the requested host name over HTTP.
[21:30:32] <d-snp> so you have the logic for choosing a db in your webapp?
[21:31:09] <GothAlice> Aye; it's work-in-progress WSGI middleware sandwiched between the overall transactional middleware and the application proper.
[21:31:43] <d-snp> cool
[21:32:28] <GothAlice> (This middleware also swaps out master templates for CNAME whitelabeling purposes.)
[21:34:55] <d-snp> hmm do you plan on having the databases all hosted by the same mongod/s instances, or instances per database?
[21:35:13] <GothAlice> Same mongo cluster, but with authentication controls enabled.
[21:35:34] <d-snp> so you'd have databases with unique names right?
[21:35:48] <GothAlice> Aye. Based on the top and second-level domain. I.e. example_com
[21:36:03] <GothAlice> Where the CNAME is set up as something like "jobs.example.com" or "careers.example.com".
[21:36:04] <d-snp> right, that makes it pretty easy
[21:36:34] <d-snp> any specific reason for not doing a cluster per customer? or just sysadmin overhead?
[21:36:53] <GothAlice> Most clients don't have datasets worthy of that level of isolation and independent scaling.
[21:37:19] <GothAlice> (I.e. we have some clients with 10,000 jobs registered… and they're still on the shared infrastructure, since that comes to about 20MB of data. ;)
[21:37:32] <d-snp> :P
[21:38:05] <d-snp> our data is in the 10s of gb's per customer, and that's the small ones unfortunately
[21:40:14] <GothAlice> You mentioned a write-heavy load. Since your dataset for a single client won't fit in RAM anyway, the only benefit of splitting into a cluster-per-client is to have cache locality. (I.e. data cached in RAM for client A that client B's data can't push out if split, but can if shared.)
[21:48:51] <stick__> when using morphia
[21:49:06] <stick__> in which case would you use the morphia class, etc
[21:49:26] <stick__> and in which case the UpdateOperations ?
[21:56:41] <cheeser> stick__: come again?
[21:57:00] <stick__> cheeser: I got cheese
[22:03:37] <boxmein> hiya, stupid question but how the hell do I query for a list of possible parameter values efficiently?
[22:04:13] <boxmein> as in, collection.find({author: [a, b, c, d, e]}).limit(N)
[22:05:54] <boxmein> oh, magical $in, thank you for existing
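For the record, the corrected form of that query:

    // match documents whose author is any of the listed values
    db.collection.find({ author: { $in: ["a", "b", "c", "d", "e"] } }).limit(20)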
[23:00:27] <MrDHat_> I am trying to use bulk find operation to query using an array like this:
[23:00:27] <MrDHat_> bulkOp.find({
[23:00:27] <MrDHat_> 'phone.number': {
[23:00:27] <MrDHat_> $in: phoneNumberArr
[23:00:28] <MrDHat_> }
[23:00:30] <MrDHat_> });
[23:01:08] <MrDHat_> Sorry should've used a paste service
[23:01:11] <MrDHat_> Here: http://hastebin.com/ulofokexaj.vbs
[23:01:37] <MrDHat_> This query is not returning any results
[23:02:53] <MrDHat_> But the same query works when db.collection is used
[23:03:11] <MrDHat_> s/db.collection/db.collection.find