[02:17:37] <StephenLynx> what happened is that some dude found out that some people are dumb enough to screw that up, and then made lots of noise blaming mongo.
[02:17:58] <StephenLynx> and put out some stupid number, but I really doubt the numbers are correct.
[02:18:00] <symbol> Heh, that'd be a good ELI5 for that post.
[02:20:29] <StephenLynx> "Readers who are concerned about access to their systems are reminded of the following resources: the most popular installer for MongoDB (RPM) limits network access to local host by default"
[04:10:17] <terabyte> I have a mongodb instance whose performance appears to be slow. If this was relational I'd look at indexes, but this is doc-based, so... Each document contains an "inventoryId" which is not unique, but every query ever made to the collection must provide at least this id. In addition to this, a query may provide other attributes to look up on (color, country, etc.). Queries are always of the "x=y"
[04:10:17] <terabyte> form, and never use inequalities... However, the set of additional attributes they might query on is up to the user. They might have all or some or none... Given this, would creating an index on the mandatory field inventoryId be useful in improving performance?
[04:13:37] <terabyte> so in short, would creating a single field index on inventoryId assist with the performance of queries such as "inventoryId=1&additionalField=2428"
[04:20:34] <terabyte> presumably for this it's best to have a hashed index yes? (because we do equality checks and nothing else)
[04:20:46] <terabyte> even if i'm not sharding my mongodb instances
[04:26:42] <Boomtime> terabyte: why would hashed be better?
[04:27:11] <Boomtime> hashed is useful to create a smooth distribution, which might be important under specific conditions
[04:27:39] <terabyte> I'm not sure, I thought hashed would be better because whenever a search takes place it will be performed on some subportion of the data.
[04:27:42] <Boomtime> for sharding a uniform distribution in the key is the easiest way to ensure balance
[04:28:03] <Boomtime> .. that doesn't really explain why hashed is better
[04:28:12] <terabyte> inventoryid is definitely not evenly distributed (there are more '100's than there are 275's for example)
[04:28:27] <Boomtime> i can't see any reason to use a hashed index from what you've said so far
[04:28:45] <terabyte> so I should just use a regular index?
[04:28:45] <Boomtime> do you need a uniform distribution for some purpose?
[04:28:57] <Boomtime> i would say just use a regular index
[04:29:08] <terabyte> i'm not doing anything fancy like sharding the instances for example
[04:29:39] <Boomtime> if you hash the index, you also lose the ability to do range queries - like "everything higher than X"
[04:29:49] <terabyte> just thought that if i'm not doing a range query on that field (only equality), then i should use hash.
[04:30:15] <Boomtime> it probably won't make any difference for what you've said so far
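A minimal sketch of where this conversation lands, assuming a reasonably recent shell ("inventory" is a hypothetical collection name):

    // regular (non-hashed) single-field index on the mandatory field
    db.inventory.createIndex({ inventoryId: 1 })

    // every query supplies inventoryId, so the index narrows the candidate
    // set even when the extra attributes (color, country, ...) are unindexed
    db.inventory.find({ inventoryId: 1, color: "red", country: "US" })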
[04:35:16] <terabyte> what's a typical time for an index to be created in the foreground if the data is about 10gb (on an ssd) with 1gb memory? are we talking 10 minutes or 10 hours?
[04:53:28] <Boomtime> depends on your queries now - if you hit a fair range of your data then much of the index will need to be in ram to be efficient
[05:03:54] <joannac> indexes are also not magic, if you want to return the document, the entire document still has to go into memory
[05:04:23] <joannac> so if you only have enough for indexes and not your working set, you're still going to be page faulting like crazy
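One rough way to check that, sketched against a hypothetical collection:

    // sizes are in bytes; if the indexes alone approach available RAM,
    // the documents of the working set will be fighting for what's left
    var s = db.inventory.stats();
    printjson(s.indexSizes);   // per-index size
    print(s.storageSize);      // size of the documents themselves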
[09:55:00] <dddh> vagelis: update_many vs update_one https://docs.mongodb.org/getting-started/python/update/
[09:57:40] <vagelis> Well, it doesn't say if an empty filter means it replaces whatever document it finds first, but I guess it works the same as find({}), which returns all documents
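For the record, an empty filter does match every document; update_one just stops after the first match. A shell sketch of the distinction, using the restaurants collection from the linked tutorial (3.2+ shell assumed):

    // matches all documents, modifies only the first one found
    db.restaurants.updateOne({}, { $set: { flagged: true } })
    // matches all documents, modifies every one of them
    db.restaurants.updateMany({}, { $set: { flagged: true } })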
[12:25:08] <StephenLynx> but does it automatically use the proper int type if it doesn't fit the default?
[12:25:13] <Derick> so you need to find which values are higher than that. Converting them to a string might work, but then you can't do calculations with them
[12:25:26] <Derick> StephenLynx: the driver should pick either Int32 or Int64 automatically
[13:11:51] <dhanasekaran> Hi guys, I am getting an error: "Failed with error 'chunk too big to move', from shard0000 to shard0001". How do I fix this? Please guide me
[13:13:00] <cheeser> start here https://docs.mongodb.org/manual/core/sharding-chunk-migration/#jumbo-chunks
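A hedged sketch of the manual fix that page describes: split the jumbo chunk so the halves fit under the migration limit (namespace and shard-key value are hypothetical):

    sh.status()                                     // locate the chunk flagged "jumbo"
    // split the chunk containing this document at its median point
    sh.splitFind("mydb.mycoll", { myShardKey: 42 })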
[13:13:56] <Zelest> how many servers are required for sharding?
[13:25:44] <pamp> How can I fetch documents by field type?
[13:27:35] <Derick> most _id fields should be type 7
[13:27:43] <dbh> currently i have a replica member syncing from another replica member and i'm curious if there is a way to get a sense of the progress
[13:28:03] <Derick> pamp: where did you get the type numbers from?
[13:28:22] <pamp> var t = db.Ericsson3GVertex_20151012.findOne() => t._id BinData(3,"MZL8eLK4rUGR437ySvR6/A==") => t._id instanceof Object true
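For reference, $type takes the BSON type number: 7 is ObjectId, 5 is BinData (which is what the _id pasted above is):

    // documents whose _id is BinData (BSON type 5)
    db.Ericsson3GVertex_20151012.find({ _id: { $type: 5 } })
    // documents with the usual ObjectId _id (BSON type 7)
    db.Ericsson3GVertex_20151012.find({ _id: { $type: 7 } })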
[13:28:24] <Derick> dbh: iirc, it's in the secondary's log file
[13:47:46] <stondo> I created a replica set with 2 Linode servers, and I expected that the secondary node would become primary if the primary failed, but it doesn't work like that
[13:47:56] <golya> then either it does not work, or meteor is crap :)
[13:48:01] <stondo> is it possible to achieve that with only 2 servers setup?
[13:48:02] <Derick> stondo: no, you need 3 nodes for that to work
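The usual workaround with only two data-bearing servers is an arbiter on a third machine, so elections still have a majority; a sketch (hostname is hypothetical):

    // run on the current primary; an arbiter stores no data, it only votes
    rs.addArb("arbiter.example.com:27017")
    rs.status()   // should now list three voting members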
[13:48:10] <golya> array is the name of the property, which has array values, right?
[13:51:58] <StephenLynx> also, I failed at reading
[13:52:59] <Derick> stondo: it depends a little on whether they calculate internal data centre traffic under their caps...
[13:53:48] <stondo> Derick, I think that if I use private network, perhaps bandwidth won't be affected, but I have to check with Linode support to make sure
[13:54:59] <Derick> stondo: also wise to check to make sure you use private network for the replication then
[13:55:26] <Derick> and to check that the replication configuration uses externally accessible IPs (if your client is outside of their private network)
[14:37:00] <the_german> hi @ all. I have a lot of documents with arrays of geopoints (they are pretty much traces of some drive with a car). I want to use a 2dsphere index. Is there an elegant way to determine whether any of the geo-points lies within some defined geofence without iterating over all of the items in the array?
[14:41:27] <the_german> Derick: Yes I know that. And also within a circle etc., right? But I need to find out whether at least one point of an array is within the polygon, e.g.
[14:41:39] <the_german> Derick: allright give me a sec
[14:42:28] <Derick> an array? what does your data look like? each document one GPS trace? and you want to find out which of your documents (with a single GPS trace) has a point within a geofence (polygon/circle)?
[14:42:55] <the_german> Derick: I already found that link. But in this example they only have one geo point. I have multiple and need to check if at least one of them is in there. Is there a function in mongo that supports that, or do I have to do that for every element of the array with e.g. $elemMatch?
[14:43:14] <Derick> the_german: can you show me a document?
[14:44:47] <the_german> Derick: Unfortunately not right now from my actual installation but...
[14:56:45] <kba> My server just restarted and suddenly mongodb wouldn't start, I was getting "[initandlisten] exception in initAndListen: 29 Data directory /data/db not found., terminating"
[14:56:46] <Derick> the_german: you should store it as a GeoJSON LineString really
[14:57:20] <Derick> and then you can use an intersect query
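A sketch of that suggestion with hypothetical collection and field names: index the LineString with 2dsphere and test it against the geofence polygon:

    db.traces.createIndex({ route: "2dsphere" })
    // matches documents whose LineString crosses the polygon
    db.traces.find({
      route: {
        $geoIntersects: {
          $geometry: {
            type: "Polygon",
            coordinates: [[[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]]
          }
        }
      }
    })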
[14:57:26] <kba> I don't recall if I've rebooted the server since I installed mongodb, but either way I found /etc/mongodb.conf and it says: storage: dbPath: /var/lib/mongodb
[14:57:45] <kba> and in /var/lib/mongodb, the databases are definitely located
[14:58:16] <kba> so why is it that when I do "sudo mongod", it looks in /data/db and not /var/lib/mongodb as specified in mongod.conf?
[14:58:24] <the_german> Derick: Yeah I will convert it into GeoJSON LineString. But isn't the intersect query on one specific geo point only?
[14:58:33] <cheeser> because you're not telling mongod to use that conf file
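That is, when started by hand, mongod only reads a config file it is explicitly pointed at (the init script does this for you):

    sudo mongod --config /etc/mongod.conf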
[14:58:46] <Derick> the_german: you want to know which exact point in the line is within the geofence?
[14:59:15] <kba> cheeser: it's just curious that this problem just suddenly appeared now. I've never manually started mongodb before
[14:59:47] <krion> using mongo 2.4, i just "switched" mongo primary <-> secondary in a replicaset, the goal is to upgrade from 2.4.9 to 2.4.14
[14:59:47] <Derick> kba: you probably should start it with something akin to "service mongod start"
[15:00:12] <krion> the thing is new secondary is still showing "syncingto" primary
[15:00:26] <krion> the message used to disappear earlier
[15:00:27] <kba> Derick: Indeed, but "sudo service mongod start" just gives "[FAIL] Starting database: mongod failed!"
[15:00:39] <the_german> Derick: Yes, eventually. But my question right now is this: https://docs.mongodb.org/manual/reference/operator/query/geoIntersects/ . Can loc in those examples also be a LineString? Yes, right?
[15:01:41] <kba> In /etc/init.d/mongod, I can actually even see that it specifies /etc/mongod.conf
[15:01:43] <the_german> Derick: I think you helped me out on the first part. But how do I only get the points within the fence?
[15:01:46] <Derick> the_german: which example exactly?
[15:02:18] <Derick> if you only want the points in a fence then hmm
[15:02:33] <krion> can i have a % of the syncing progress ?
[15:02:48] <the_german> Derick: Tricky I know. I am sorry that I bother you with this
[15:03:17] <Derick> the_german: i am not 100% certain whether you can (in one document) have an array of GeoJSON points work together with a 2dsphere index
[15:04:50] <kba> Derick: Figured it out. It just seems like it couldn't unlink the sock file. I assume the server crashed and it wasn't cleaned up properly after.
[15:06:14] <the_german> Derick: How would you recommend solving this? Just do it without an index?
[15:06:53] <Derick> so two solutions: 1. you need to store each point in its own document; 2. do it in two phases: a $geoIntersects with a LineString and Polygon geofence, and then filter out the points in your app
[15:06:59] <Derick> i don't think it works without index
[15:07:07] <Derick> or if it does, it will be *really* slow
[15:08:51] <the_german> Derick: The second one is actually how I do it right now. First I check with $elemMatch whether a doc contains any point within the fence. After that I do $unwind on the array, and then filter out all the elements I don't need. At last I repack the array with $group, so I don't have to do it in the app
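A sketch of that pipeline with hypothetical names, assuming (per the caveat above) that geo predicates work against the point array on this deployment:

    var fence = { type: "Polygon",
                  coordinates: [[[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]] };
    db.traces.aggregate([
      // keep only docs with at least one point inside the fence
      { $match: { points: { $elemMatch: { $geoWithin: { $geometry: fence } } } } },
      { $unwind: "$points" },
      // drop the individual points that fall outside
      { $match: { points: { $geoWithin: { $geometry: fence } } } },
      // repack the surviving points per original document
      { $group: { _id: "$_id", points: { $push: "$points" } } }
    ])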
[15:09:54] <krion> is there an easy way to see the status of a "syncing" replicaset ?
[15:10:08] <krion> except rs.conf() and rs.status(), i'm not really used to it
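Besides rs.status(), the shell has a lag helper (assuming your version ships it), which is about as close to a progress view as 2.4 offers:

    rs.printSlaveReplicationInfo()   // "N secs behind the primary" per member
    // or compare optimes by hand:
    rs.status().members.forEach(function (m) {
      print(m.name + "  " + m.stateStr + "  " + m.optimeDate);
    })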
[15:54:06] <christo_m> but i want to upsert not only the parent collection but also the array if necessary instead of just blindly pushing new items in
[15:56:26] <christo_m> StephenLynx: not sure if that makes sense or not
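One common pattern for that, sketched with hypothetical collection and field names: try to update the matching array element in place, and only push when nothing matched:

    // 1) update the element if it is already in the array
    var res = db.carts.updateOne(
      { _id: cartId, "items.sku": item.sku },
      { $set: { "items.$.qty": item.qty } }
    );
    // 2) nothing matched: append the new element instead
    if (res.matchedCount === 0) {
      db.carts.updateOne({ _id: cartId }, { $push: { items: item } });
    }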
[17:48:45] <fullstack> What's the best approach to check database consistency, validate columns, check for missing references, and deal with mismatched JSON structures in a database?
[17:55:16] <fullstack> Thanks. Yes, I think not putting garbage into the database in the first place is the best approach; Mongoose.Validate seems like the way to do that
[17:56:11] <fullstack> I'm curious if there are others, short of dumping the database, piping it through the schema again with a good Mongoose.Validate, and reporting on any validation errors.
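A minimal sketch of that idea: run documents through a Mongoose schema's validators without saving (schema and field names are hypothetical):

    var mongoose = require('mongoose');

    var userSchema = new mongoose.Schema({
      email: { type: String, required: true, match: /@/ },
      age:   { type: Number, min: 0 }
    });
    var User = mongoose.model('User', userSchema);

    // validate() runs the same checks save() would, without writing
    new User({ age: -1 }).validate(function (err) {
      if (err) console.log(Object.keys(err.errors)); // the invalid paths
    });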
[19:08:49] <symbol> I'm working on paginating based on the date of my blog posts. I'd like to return 25 results per page and then have a "More" button that only gets displayed if there are more in the database. I imagine I can't really do that with the native driver in a single query?
[19:09:32] <symbol> Right now I just pull one more document than I need and show the "More" button if I get back more than the page size, but that generally leaves out the last doc if it's even.
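The usual fix for that off-by-one: fetch limit + 1, display only limit, and use the extra doc purely as the has-more signal. A sketch with the native driver (render() and lastSeenDate are hypothetical):

    var limit = 25;
    db.collection('posts')
      .find({ date: { $lt: lastSeenDate } })   // resume from the previous page
      .sort({ date: -1 })
      .limit(limit + 1)
      .toArray(function (err, docs) {
        if (err) throw err;
        var hasMore = docs.length > limit;
        var page = docs.slice(0, limit);       // the sentinel doc is never shown
        render(page, hasMore);
      });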
[19:32:52] <garyserj> so for example the findOne function in 1.4 http://mongodb.github.io/node-mongodb-native/1.4/api-generated/collection.html#findone Will that example work in all current versions of mongodb with node?
[19:33:22] <garyserj> and if not, then is that documented anywhere?
[22:26:13] <symbol> StephenLynx: Since you're familiar with node.js - I shouldn't ever see more than one connection in the mongod log right?
[22:32:22] <shoshy> how can i find all documents whose "result" field (which is an array) contains an object whose "something" property equals "something else"?
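Two equivalent sketches for that ("coll" is hypothetical):

    // dot notation: matches if any element of "result" has the value
    db.coll.find({ "result.something": "something else" })
    // $elemMatch does the same here, and becomes necessary once several
    // conditions must hold on the same array element
    db.coll.find({ result: { $elemMatch: { something: "something else" } } })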