[01:25:28] <dgarstang> Need some help with mongo. I've enabled SSL and now two shards aren't working anymore... :(
[01:40:24] <abstrakt> what's the best way to find out how much time a given query takes
[01:40:42] <abstrakt> when I have e.g. about 40,000 records, mongo wants to page them by default inside of the mongo console
[01:41:26] <abstrakt> basically I want to know which part of the 3 seconds it takes to deliver my JSON through my API is taken by my application layer and which part is the query
[01:41:40] <abstrakt> e.g. how long does the query take vs how long does it take my application layer to render those results as JSON
[01:48:15] <abstrakt> I suspect my application layer, but I'd like to confirm that, just not sure how to tell how long the query took
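A minimal shell sketch of what abstrakt is after: timing the query on the server so the application layer is out of the picture (the collection name "records" is a stand-in):

    // explain() runs the query server-side and reports the execution
    // time in the "millis" field, independent of any application code.
    db.records.find().limit(40000).explain()

    // Alternatively, log every operation slower than 100 ms with the
    // profiler and inspect the slowest ones afterwards:
    db.setProfilingLevel(1, 100)
    db.system.profile.find().sort({ts: -1}).limit(5).pretty()

Subtracting that server-side time from the 3-second total gives the share spent in the application layer.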
[02:19:53] <wojcikstefan> hey, do you know what's the acceptable replication lag? I'm trying to figure out when a member changes its state from SECONDARY to RECOVERY (i.e. how big the replication lag has to be for that to happen).
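The switch to RECOVERING isn't a fixed lag number: a secondary goes into that state when it falls so far behind that the primary's oplog no longer covers its sync point, so the oplog window is the figure to watch. Two shell helpers:

    // Per-secondary lag behind the primary, in seconds:
    rs.printSlaveReplicationInfo()

    // The oplog window; a secondary that falls further behind than this
    // can no longer catch up and needs a resync:
    rs.printReplicationInfo()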
[03:52:40] <geardev> guess i'll go cry then read the docs a little more
[04:54:50] <voxadam> I'm attempting to install mongodb on a Debian sid box but it fails to start. Does anyone have any thoughts? http://pastebin.com/WN7yqExJ
[05:27:23] <abstrakt> hmm, so I just made a super simple application with express, and I'm trying to deliver approx 50,000 records but I get no data back if I set limit to 100000
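One hedged guess at the cause: building the whole response in memory (e.g. via toArray()) before sending it. Streaming the cursor keeps memory flat; a minimal Express sketch, with the URL, database, and collection names assumed:

    var express = require('express');
    var MongoClient = require('mongodb').MongoClient;

    MongoClient.connect('mongodb://localhost:27017/test', function (err, db) {
      if (err) throw err;
      var app = express();
      app.get('/records', function (req, res) {
        res.type('json');
        res.write('[');
        var first = true;
        // Stream documents one by one instead of calling toArray(),
        // so the full result set is never held in memory at once.
        var stream = db.collection('records').find().limit(100000).stream();
        stream.on('data', function (doc) {
          res.write((first ? '' : ',') + JSON.stringify(doc));
          first = false;
        });
        stream.on('end', function () { res.end(']'); });
      });
      app.listen(3000);
    });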
[07:02:35] <Ponyo> Is there a processor more effective at the workload mongo presents than the Intel Xeon?
[08:04:36] <amagee> hey i just upgraded my ubuntu vps and now when i start mongodb it seems to terminate straight away.. logs here http://pastebin.com/cKzpntRq .. any ideas?
[11:02:14] <Shapeshifter> Hi. I'm writing an application which does large scale graph computations. I need to store the resulting data somehow and I'm thinking I could use mongodb, but I'm not quite sure how to design the documents. Some facts: 1) Every node in the graph will carry some data which needs to be persisted. 2) The data of many nodes might be identical. 3) Nodes may be added and removed from the graph, after which the computation will be re-run, which will cause a relatively small number of nodes to have changed data, which again needs to be persisted.
[11:03:42] <Shapeshifter> I'm thinking I could have a data collection which stores data, identified by some hash. Every node could reference one of these data items (so that nodes with equal data all point to the same data document). That part is relatively clear
[11:04:52] <Shapeshifter> I'm not so sure about the versioning. Basically I would need to be able to query for "all data documents which represent data of a graph of revision XYZ", but many nodes may not have updated their data for that revision, so the data of an older revision would need to be used.
[11:05:46] <Shapeshifter> So for every node, I would need to query "is there a data document for revision n?" and if not, query "is there a data document for revision n-1?", and again n-2, n-3 etc. until I find something, but this sounds slow.
[11:11:50] <rspijker> Shapeshifter: you could just use $lte (less than or equal) in your query…
[11:12:35] <Shapeshifter> rspijker: but I would always need the newest revision. So if a node has 3 different data documents, one at rev1, one at rev21 and one at rev25, I need the data from rev25 but not the others
[11:13:17] <rspijker> well, you could do a sort and limit, to get just the latest result
[11:13:53] <Shapeshifter> rspijker: is sorting expensive? I might have some 2-10 million data documents which I need (plus a few million more of older revisions which I wouldn't need in this query)
[11:14:09] <rspijker> it can be, but you can add an index for that specific field
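Sketching rspijker's suggestion in the shell (collection and field names invented for illustration):

    // One data document per (node, revision); fetch the newest revision
    // at or below the revision being queried:
    db.nodedata.find({node: "nodeA", rev: {$lte: 25}}).sort({rev: -1}).limit(1)

    // With a compound index the sort walks the index instead of sorting
    // millions of documents in memory:
    db.nodedata.ensureIndex({node: 1, rev: -1})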
[11:20:27] <kali> mongodb does not address graphs specifically, but the computing difficulties will lurk and bite you for sure. neo4j may look scary because the hard part shows up earlier... i don't know :)
[11:21:22] <Shapeshifter> the thing is that I don't really need to store the graph. The graph is actually source code, an AST, plus some extra edges, but I can always recreate the graph from the source code and the graph computation framework itself doesn't provide any form of persistence, so basically I only need to store data, but not the graph structure itself, which is present in the code.
[11:23:30] <Shapeshifter> The different revisions of the graph are revisions of a git repo. I'm actually thinking I might store the data right there in the git repository to get free versioning (i.e. create a branch which contains the original code plus one data file for each source code file). Something like that. But I doubt it would work nicely.
[11:23:40] <kali> rspijker: a node in a DAG can have two (or more) parents
[11:23:59] <Shapeshifter> rspijker: you can have something like >--< in a DAG, which is not a tree
[11:30:42] <rspijker> which version of mongod is this?
[11:30:45] <Shapeshifter> kali: interestingly, the direction doesn't really matter even for an AST. e.g. in my representation, the arrows point from children to parents, but it might just as well be the other way around
[11:43:45] <Ravenheart> " Error: couldn't add user: User and role management commands require auth data to have schema version 3 but found 1 at src/mongo/shell/db.js:1004"
[11:43:50] <Ravenheart> yes i opened the shell inside the IDE
[11:43:53] <Ravenheart> and wrote the command myself
[11:44:28] <rspijker> yeah… so that’s still on v2.4
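For the record, 2.6 ships a command that migrates 2.4-style (schema version 1) auth data to the new schema; hedged, since the exact state of Ravenheart's deployment isn't visible here:

    // Run as an administrator against the admin database on 2.6:
    use admin
    db.adminCommand({authSchemaUpgrade: 1})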
[12:03:48] <arussel> I need to update a collection using a mongo shell script. What is the proper way to do it? var cursor = db.foo.find(); while(cursor.hasNext()){ var el = cursor.next(); db.foo.update({_id: el._id}, {...}); } seems to create problems
[12:27:38] <rspijker> arussel: just do a .update ?
[12:28:17] <rspijker> without the cursor that is...
[12:32:04] <arussel> don't I have to use snapshot() ? I do see the same document more than once
[12:32:23] <arussel> I need the document to know how to update it
[12:35:13] <arussel> yeah, using snapshot solves the issue :-)
[12:35:24] <kali> arussel: yeah, _id, or any index which makes sense
[12:35:35] <kali> arussel: i mean, $snapshot is just a sort by _id
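The pattern arussel ended up with, sketched with a placeholder $set payload:

    // snapshot() traverses the _id index, so a document that the update
    // moves on disk is never returned a second time by the cursor.
    var cursor = db.foo.find().snapshot();
    while (cursor.hasNext()) {
        var el = cursor.next();
        db.foo.update({_id: el._id}, {$set: {/* fields derived from el */}});
    }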
[12:40:56] <arussel> if I have {"_id": 1, a: "a"}, does db.foo.update({_id: 1}, {$set: {a: "a"}}) take a write lock ?
[12:41:27] <kali> i think it does, but not for long :)
[12:44:46] <arussel> so it might be better to do a read before an update from the application, instead of just throwing updates at mongo hoping it manages better
[12:57:39] <kali> arussel: i don't think you'll gain much. the lock will be held only the time needed to do the comparison, so it should be very fast. if you perform a client side find, you expose yourself to a race condition
[12:58:22] <kali> arussel: don't overthink the write lock too much. it is only a problem in pathological cases
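An illustration of kali's race-condition point, reusing the document from the question; the guarded form is also a cheap answer to the original lock concern, since a non-matching update never reaches the write path:

    // Atomic: match and write happen as one operation, so no other
    // client can change the document in between. The $ne guard makes
    // the update a no-op read when the value is already "a".
    db.foo.update({_id: 1, a: {$ne: "a"}}, {$set: {a: "a"}})

    // Racy: another client can modify the document between these calls.
    var doc = db.foo.findOne({_id: 1});
    if (doc.a !== "a") db.foo.update({_id: 1}, {$set: {a: "a"}});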
[15:59:01] <doug_> I was following this http://docs.mongodb.org/manual/tutorial/configure-ssl/#mongo-shell-ssl-connect but then I found http://docs.mongodb.org/master/tutorial/configure-x509/ ... Which one do I want?
[15:59:14] <saml> wait.. that looks like server to server
[15:59:22] <saml> you prolly need to consult ruby driver doc
[16:02:23] <doug_> Those two docs seem to say the same thing, but in confusingly similar ways
[16:03:10] <doug_> the ruby driver seems to basically ignore the ssl_cert option when connecting.
[16:04:12] <doug_> if I enable ssl with CA verification, I can connect locally with the mongo command by supplying the PEM file. However, the ruby driver when given the ssl_cert fails
[16:04:57] <doug_> I can disable CA verification, but I don't know if that's good enough. No pem file is required, so how the heck does encryption even work? What's it encrypting against?
[16:06:10] <saml> yah https://github.com/mongodb/mongo-ruby-driver/blob/master/lib/mongo/client.rb it seems to use :database and :write only
[16:07:29] <doug_> the docs for the ruby driver are terrible. I can't find mention of SSL at all https://github.com/mongodb/mongo-ruby-driver/wiki
[16:13:44] <doug_> This SUCKS. This will work... mongo --ssl --sslPEMKeyFile /etc/ssl/mongodb-qa.pem but when I pass ssl_cert of /etc/ssl/mongodb-qa.pem to the ruby driver it effin fails.
[17:07:13] <doug_> if I enable SSL in mongo... that encrypts between client and mongos.... but what about within the cluster? Is that then encrypted too?
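SSL is configured per process, so when every mongod and mongos in the deployment requires it, member-to-member traffic (replication, sharding) is encrypted too. A hedged 2.6-era config sketch; the paths are placeholders:

    # mongod.conf
    sslMode = requireSSL
    sslPEMKeyFile = /etc/ssl/mongodb.pem
    sslCAFile = /etc/ssl/ca.pem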
[20:20:33] <adrian_berg> the stuff that works: https://pastee.org/wq6e4 the bluebird paste is still failing though, and that's what i'm trying to get help with
[20:23:21] <blizzow> How do I up the maximum number of connections my mongodb allows?
[20:35:20] <saml> blizzow, how many connections does your mongodb currently allow?
[20:47:07] <thevdude> I have a collection with entries very much like this: {_id: 1, team: "Team Awesome", abbr: "AWSM", players: ["AWESOMEPLAYER", "AWSMPSSM", "KICKAWES"]}, how can I replace one of the items in the "players" array without knowing which specific item it is with another given string?
[21:00:00] <thevdude> I figured it out, you have to use the $ positional operator
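The form thevdude presumably landed on, using the sample document from the question (the collection name "teams" and the value "NEWPLAYER" are just examples):

    // The $ operator refers to the first array element matched by the
    // query, so the matched player is replaced in place:
    db.teams.update(
        {_id: 1, players: "AWSMPSSM"},
        {$set: {"players.$": "NEWPLAYER"}}
    )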
[21:02:43] <blizzow> Derick: even if I set ulimit -n 64000 and put maxConns = 20000 in my mongodb.conf, db.serverStatus(); still returns a maximum of 16000 connections. :(
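16000 is exactly 80% of 20000, which matches mongod capping connections at roughly 80% of its available file descriptors; a likely explanation is that the ulimit set in the admin's shell never applied to the mongod process (init scripts set their own). Worth checking against the running process:

    # The limits the running mongod actually inherited:
    cat /proc/$(pidof mongod)/limits | grep 'open files'

Then db.serverStatus().connections shows what mongod computed from them.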
[22:02:57] <thdbased> Question about DB structure. Having post documents in my DB, is it best to have the comments on the posts in the same document or in a separate collection?
[22:10:34] <cheeser> even if you don't get near the 16MB limit, doc growth can cause movement on disk
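The trade-off in schema form, hedged as the two common patterns rather than a rule:

    // Embedded: one read returns the post with its comments, but every
    // comment grows the document (movement on disk, 16MB cap).
    {_id: 1, title: "My post", comments: [{by: "ann", text: "..."}]}

    // Separate collection: unbounded comment counts at the cost of a
    // second query; index the foreign key.
    db.comments.insert({post_id: 1, by: "ann", text: "..."})
    db.comments.ensureIndex({post_id: 1})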
[22:31:42] <whomp> every hour, i need to import about 30 million rows to a table from a csv file. each row of the file has a latitude and longitude value. how can i import them as one geojson object?
[22:33:04] <Derick> whomp: 30 million points in one document?
[22:36:12] <whomp> the row has four fields: lat, lon, value, and time. the objects are weather forecasts: x temperature at y time in some place
[22:36:33] <Derick> okay, then yes, one document per row I'd say
[22:36:37] <whomp> i want to run queries on the data afterwards to find forecasts for the area around someone
[22:36:49] <Derick> so one geojson object (let's call it loc) with the lon, lat pair in it
[22:37:06] <Derick> and then two other fields for time, and temp
[22:37:09] <whomp> ok, so back to the first question. how do i import csv quickly with two of the values used to create a geojson object?
[22:38:06] <whomp> or a point object if i'm confused. whatever a simple coordinate pair should be
[22:45:33] <Derick> a point should be a geojson object
[22:47:44] <Derick> as for importing, you probably should write a script in your favourite language
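A hedged Node sketch of the kind of script Derick means; the file name, CSV column order, and collection name are all assumptions. Each row becomes one document with a GeoJSON Point, and a 2dsphere index serves the "forecasts near someone" queries:

    var fs = require('fs');
    var readline = require('readline');
    var MongoClient = require('mongodb').MongoClient;

    MongoClient.connect('mongodb://localhost:27017/weather', function (err, db) {
      if (err) throw err;
      var coll = db.collection('forecasts');
      var batch = [];
      var rl = readline.createInterface({input: fs.createReadStream('forecasts.csv')});
      rl.on('line', function (line) {
        var f = line.split(','); // assumed column order: lat,lon,value,time
        batch.push({
          // GeoJSON coordinates are [longitude, latitude]
          loc: {type: 'Point', coordinates: [parseFloat(f[1]), parseFloat(f[0])]},
          value: parseFloat(f[2]),
          time: new Date(f[3])
        });
        // Insert in batches so 30 million rows never sit in memory at once.
        if (batch.length === 1000) coll.insert(batch.splice(0), function () {});
      });
      rl.on('close', function () {
        var finish = function () {
          coll.ensureIndex({loc: '2dsphere'}, function () { db.close(); });
        };
        if (batch.length) coll.insert(batch, finish); else finish();
      });
    });

With that index in place, a $near query answers "forecasts around a point", e.g. db.forecasts.find({loc: {$near: {$geometry: {type: "Point", coordinates: [lon, lat]}, $maxDistance: 5000}}}).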
[23:59:51] <geardev> given these three files, how would you actually display the scores collection on the screen? Right now it's returning [object Object] https://pastee.org/ftkb
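Without seeing the paste, "[object Object]" is almost always a document (or an array of them) being string-concatenated, which calls toString(); a hedged guess at the fix:

    // Concatenating a document invokes toString() -> "[object Object]":
    res.send('scores: ' + docs);

    // Serialize explicitly instead:
    res.send(JSON.stringify(docs));
    // or, in Express, let the framework do it:
    res.json(docs);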