[14:39:52] <Derick> under res it shows mongodb is using 6.1GB now
[14:39:58] <BlakeRG> free -m tells me i have 68MB of memory left =(
[14:40:03] <Derick> are you perchance using swap as well?
[14:40:56] <BlakeRG> no swap used, 511 available 511 free
[14:41:30] <BlakeRG> i am just confused as to why i am still seeing writes to my instance when i turned off the app over like 5 minutes ago now
[14:42:29] <Derick> what does the mongodb log say?
[14:43:02] <Derick> with unacknowledged writes the client app doesn't wait until things have actually been written; they're most likely sitting in a queue.
[14:44:10] <BlakeRG> Derick: yeah, i am assuming thats the cause because this is an analytics platform so they probably aren't too concerned with ensuring the write happens immediately
[14:44:48] <Derick> k, writes should still only take ~2 seconds though
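(A minimal mongo shell sketch of the acknowledged vs. unacknowledged distinction Derick describes; the collection name and the 2.6-era writeConcern option are assumptions, not taken from BlakeRG's app:)

    // acknowledged write: the client waits for the server's response before moving on
    db.pageviews.insert({ path: "/", ts: new Date() })
    // unacknowledged write: fire-and-forget, the driver queues it and returns immediately
    db.pageviews.insert({ path: "/", ts: new Date() }, { writeConcern: { w: 0 } })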
[14:52:31] <BlakeRG> thanks for the pointers Derick, my sysadmin isn't here to help me with this stuff and i don't have access to do a lot of things either :(
[16:08:53] <orweinberger> I have a 14GB database with a single collection that holds just over 8 million documents. I'm doing a very simple count of all documents whose timestamp is within the last 30 days. Nothing fancy at all. The query takes ~1:10 minutes to run. I'm running it on a 2GB memory / 8 CPU machine. Is this performance within the normal range?
[16:11:22] <Joeskyyy> orweinberger: Is it sharded?
[16:11:43] <BlakeRG> Derick: thanks again for the help earlier
[16:13:47] <Joeskyyy> If you don't have an index, or it's not sharded (and therefore indexed on the shard key), you're going to see a performance hit.
[16:14:28] <orweinberger> I have an index on the 'timestamp' field
[16:16:18] <freeone3000> When calling rs.initiate(), I'm getting an error that some other members already have data. They shouldn't; they're clean mongo installs. I opened a console on each one: they have an empty database named local, and an admin database with some data. I think `admin` should remain. How do I get a replica set initiated?
[16:16:33] <Joeskyyy> orweinberger: when you run your query with .explain(), is it actually using the index?
[16:16:57] <orweinberger> Didn't try, i'll try that. Got the line from the logs though, might give you some insight:
[16:20:08] <Joeskyyy> Hmm, yeah. It might not be using the index, I'd be curious to see what your .explain() says
[16:20:38] <Joeskyyy> i /think/ the hang up might be with your sort (see: http://docs.mongodb.org/manual/core/aggregation-pipeline/#pipeline-operators-and-indexes)
[16:21:05] <Joeskyyy> Since you're calling the $sort after $group, it doesn't use the index for sorting… but it /should/ use it for $match
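(A hedged sketch of the query shape under discussion; the collection name, field names, and the $group stage below are assumptions used only to illustrate the index points made above:)

    var since = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000)
    // count-style query: .explain() shows whether the { timestamp: 1 } index is used
    db.events.find({ timestamp: { $gte: since } }).explain()
    db.events.count({ timestamp: { $gte: since } })
    // aggregation form: $match can use the index, but a $sort placed after $group cannot
    db.events.aggregate([
        { $match: { timestamp: { $gte: since } } },
        { $group: { _id: "$source", n: { $sum: 1 } } },
        { $sort: { n: -1 } }
    ])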
[16:21:32] <freeone3000> Joeskyyy: Stopped mongodb, deleted /var/lib/mongodb/*, started mongo, same error.
[16:22:53] <orweinberger> Joeskyyy, thanks for your comments!
[16:23:22] <Joeskyyy> freeone3000: What's the exact error?
[16:24:15] <freeone3000> Joeskyyy: {"errmsg":"couldn't initiate : member 10.0.2.201:27017 has data already, cannot initiate set. All members except initiator must be empty.", "ok":0}
[16:26:04] <Joeskyyy> Something must still be there then. Otherwise it shouldn't be throwing that error.
[16:26:25] <freeone3000> Joeskyyy: Sure. But if it is, a `rm -rf /var/lib/mongodb/*` didn't get it.
[16:27:18] <Joeskyyy> freeone3000: Be careful when doing that. Always make a backup of what you're changing (: good practice. So cp -r /var/lib/mongodb /var/lib/mongo.bak or something like that.
[16:27:34] <freeone3000> Joeskyyy: Again, it's a fresh install. There's no data on any of the DBs.
[16:27:43] <Joeskyyy> Can you do an ls -la on /var/lib/mongodb/ and ensure that it is indeed empty?
[16:28:29] <freeone3000> Joeskyyy: Looks like there's still a local.0, local.1, local.2, local.ns, mongod.lock, _tmp/, and journal/.
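(A small sketch, assuming a mongo shell open directly against one of the members, for seeing what the "has data already" check is complaining about:)

    // hedged: list what each member actually still holds and compare it against the error
    show dbs
    use local
    db.getCollectionNames()     // the local.0 / local.ns / _tmp files on disk back this database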
[17:40:10] <BlakeRG> so i would venture to guess that if a query took a few seconds to run, it would delay the other queries in the stack until the current one finished
[17:41:19] <Joeskyyy> I'm going to venture to say yes? I'm not 100% sure with the node drivers though.
[17:41:28] <Joeskyyy> Maybe someone else has a better answer for you though :<
[17:41:41] <heewa> I'm having trouble getting my queries to hit an index on _id where the _ids are dicts like {'_id': {'t': Date(), 'S': ObjectId(..)}} and the query is like {'_id.t': Date(), '_id.S': ObjectId(...)}
[17:42:10] <heewa> Used explain and it's not hitting the index. But if I query for {'_id': 'yay'} it does use the index (though nothing matches...)
[17:43:08] <freeone3000> heewa: Did you try queries like {"_id":{"t":Date(),"s":ObjectId(...)}} ?
[17:45:49] <heewa> freeone3000: That also used the index, but didn't match the record.
[17:46:10] <Joeskyyy> heewa: Can you pastebin a getIndexes() from your collection?
[17:49:54] <Nodex> BlakeRG : the driver doesn't, but ALL drivers pool connections, so it's possible there is a queue there
[17:49:57] <heewa> freeone3000: Joeskyyy: Got it to work. Multiple things: 1) I need to query by dict, not dotted selection of sub-items ({'_id': {'t': ... instead of {'_id.t': ...), 2) order of items in the query dict matters. {'_id': {'S': .., 't': ..}} yields result while {'_id': {'t': .., 'S': ..}} doesn't.
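(A hedged reconstruction of the two query shapes heewa compared; the collection name and values are made up:)

    var id = { t: ISODate("2014-01-30T00:00:00Z"), S: ObjectId() }
    db.samples.insert({ _id: id, v: 1 })
    // whole-subdocument match: key order must match the stored order, and it can use the _id index
    db.samples.find({ _id: { t: id.t, S: id.S } })
    // dotted form: order-independent, but was not hitting the { _id: 1 } index in heewa's explain()
    db.samples.find({ "_id.t": id.t, "_id.S": id.S })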
[17:50:42] <Joeskyyy> ^ Order has always been a thing yeah.
[17:50:46] <Nodex> dealing with order is easy - just apply a sort to it
[17:51:07] <Joeskyyy> Mongo 2.6(?) or 2.8 will have the ability to use multiple indexes in a single query.
[17:51:28] <heewa> Of a query? I knew order mattered for sorting, which is why the pymongo driver only accepts a list of tuples like sort=[('key', 1), ('another', -1)], but for a query it takes a dict, which is inherently unordered.
[17:51:39] <heewa> Nodex: Can't sort a dict. It's unordered.
[17:53:31] <freeone3000> Nodex: You can sort a python list. Dicts are not lists.
[17:54:09] <Nodex> tbh I can't fathom any language that cannot sort a multi-dimensional array / object
[17:54:17] <heewa> Nodex: it's not about python or a language thing, it's about data structures. Let's not get into a flamewar, but if you're curious, look up what a hash map is.
[17:54:18] <freeone3000> Nodex: Essentially, dicts are hash tables, whose sort order is dependent on the number of elements and the hashing function.
[17:54:40] <Nodex> heewa : not interested tbh, the problem is fixable if the language can do what it needs to
[17:54:45] <freeone3000> Nodex: Using perfect knowledge, it's predictable (otherwise, they wouldn't work), but it's not in any order you would generally consider "sorted".
[17:55:02] <freeone3000> Nodex: If you want a fixed iteration order, take your objects, put them in a list, sort that.
[17:55:04] <Nodex> either it can't or it can and you don't have the understanding of it that you need
[17:57:40] <Joeskyyy> PyMongo also supports “ordered dictionaries” through the bson.son module. The BSON class can handle SON instances using the same methods you would use for regular dictionaries. Python 2.7’s collections.OrderedDict is also supported.
[17:58:27] <freeone3000> ...And everyone else whose Maps are chaotically ordered.
[17:58:36] <freeone3000> Because putting a linkedlist in *every* hashtable is silly.
[17:58:42] <Joeskyyy> I live life dangerously and never index anything.
[17:58:55] <Nodex> why would you map an unmapped / unstrict schema?
[17:59:28] <freeone3000> Nodex: Java calls key-value collections "Maps", like Python calls them dicts, Javascript calls them objects, and PHP calls them arrays.
[17:59:55] <Nodex> my mistake, I don't know java, I assumed you were talking about some ORM crap
[18:00:08] <freeone3000> Oh no, we also do that, we just call it Morphia.
[18:01:10] <Joeskyyy> More like snorephia, amirite? … i need to be off the internet today.
[18:02:51] <Nodex> time to take my new DDJ-SZ for a test drive \o/
[18:18:40] <ismell> if I do an update that has {$unset: {a: true, b: true}, $set: {a: 3}} does $unset get run first?
[18:24:42] <bitpshr> Hi everyone. I'm trying to do a simple insert of a GeoJSON point into a fresh collection: db.users.insert({'loc': {'type': 'Point', 'coordinates': [50, 50]}})
[18:25:27] <bitpshr> I keep getting this error: "location object expected, location array not in correct format". There isn't any outside context here, this is a fresh db and a fresh collection and is within the mongo shell. I created a 2d index on `loc` as well. What am I missing?
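(That error is what a 2d index typically gives for GeoJSON documents, since 2d expects legacy [x, y] pairs; a hedged sketch of switching to 2dsphere, assuming the index was created on `loc` as described:)

    db.users.dropIndex({ loc: "2d" })              // assuming the 2d index was created with this key
    db.users.ensureIndex({ loc: "2dsphere" })      // 2dsphere understands GeoJSON geometry
    db.users.insert({ loc: { type: "Point", coordinates: [50, 50] } })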
[18:54:09] <proteneer> are mutable, growing arrays something to be avoided?
[18:59:48] <badaroo> I'm new to MongoDB, and I'm designing an e-commerce application that has the model "Product", which you can see here: http://pastebin.com/YV2er5cY
[19:00:14] <badaroo> But I think this is so smelly... is this the best way to declare it?
[19:01:27] <Joeskyyy> The mongo docs have a pretty good catalog example
[19:38:19] <proteneer> The positional $ operator identifies an element in an array field to update without explicitly specifying the position of the element in the array.
[19:38:27] <proteneer> I want to explicitly specify the position
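(For an explicit position, dot notation with a literal array index works instead of the positional $ operator; a small sketch with made-up collection and field names:)

    db.jobs.insert({ _id: 1, frames: [10, 20, 30] })
    // "frames.2" addresses the third element directly, no positional $ required
    db.jobs.update({ _id: 1 }, { $set: { "frames.2": 99 } })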
[20:37:29] <Joeskyyy> Only issue you may run into, I'd estimate, would be rounding with floating points.
[20:37:58] <Joeskyyy> So depending on how exact and precise you need it, just be cautioned of that.
[20:38:42] <juliengrenier> I found a bug (I think) related to indexes after upgrading from 2.5.4 to 2.5.5
[20:39:54] <juliengrenier> I am now getting "exception: assertion src/mongo/db/exec/projection_exec.cpp:279" on distinct queries which used a sub-document index
[20:40:00] <badaroo> I see, gotta store it in cents...
[20:40:20] <Joeskyyy> That'd be one way to do it as well.
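(A sketch of the store-it-in-cents idea; the collection and field names are made up:)

    // integer cents keep arithmetic exact; format to dollars only when displaying
    db.products.insert({ sku: "shirt-01", price_cents: 1999 })                  // $19.99
    db.products.update({ sku: "shirt-01" }, { $inc: { price_cents: -500 } })    // exact $5.00 discount
    db.products.find({ price_cents: { $lte: 2500 } })                           // range queries stay exact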
[21:44:41] <hephaestus_rg> if i have 1500 documents that embed a total of 100,000 documents, will i see negative impacts from adding an index on the embedded documents?
[21:53:26] <groundup> I am trying to get all of the duplicate slugs: http://pastebin.com/Bz1BRWkt
[21:57:20] <groundup> Using the PHP MongoDB driver. Every document has a slug, unless this match is wrong too: array('slug' => array('$exists' => false))
[22:03:53] <Derick> with '$_id' ... just like you did with '$slug'
[22:04:32] <freeone3000> When specifying a shard as a replica set member, I should do it by hostname. Understood. But I don't specify primary in a replica set. How do I make sure that I specify the correct member?
[22:06:10] <freeone3000> joannac: I want my mongos instance to route requests for the proper key range to the primary of the matching shard's replica set. How do I make sure I get primary, if I'm not supposed to specify it?
[22:06:30] <freeone3000> joannac: Or am I just supposed to put in a random server in the set and hope that things work?
[22:07:06] <groundup> If I add `'_id' => '$_id'` it groups every document in there.
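(A hedged mongo shell version of the duplicate-slug pipeline being discussed; the PHP array() form groundup is using maps one-to-one onto this. Grouping on '$slug' rather than '$_id' is the point, since _id is unique per document:)

    db.posts.aggregate([
        { $group: { _id: "$slug", count: { $sum: 1 } } },   // one group per distinct slug
        { $match: { count: { $gt: 1 } } }                   // keep only slugs appearing more than once
    ])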
[22:10:38] <joannac> freeone3000: your shards should specify all nodes in the set, and the mongoS should route to primary unless you change your read preference
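(A hedged sketch of what "specify all nodes in the set" looks like when adding the shard; the set and host names are assumptions:)

    // replicaSetName/host1,host2,... lets mongos discover and follow the current primary
    sh.addShard("rs_us/mongo-us-1:27017,mongo-us-2:27017,mongo-us-3:27017")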
[22:11:46] <freeone3000> joannac: Okay, thanks. I may be doing something slightly odd - I'm using sharding to choose the DB that "should" hold a user's data, to make sure that they get short response times.
[22:12:07] <freeone3000> joannac: Does this mean that every instance of every replica set must be reachable by every mongos?
[22:13:59] <freeone3000> Thanks. Time for some fun-with-PAT-and-international-datacenters.
[22:14:47] <freeone3000> One further. Is it safe to have different config servers for each mongos in a shard set, as long as the servers each have the same hostname and listen on the same port, and contain the same info?
[22:15:27] <freeone3000> joannac: Okay. But from what I've been reading, config servers need to come to agreement. How much latency is involved in building consensus between config servers?
[22:15:57] <joannac> nothing will finish unless all 3 config servers are in consensus
[22:16:13] <freeone3000> joannac: Right. Which means that the config servers should be located close to the mongos instance.
[22:16:27] <freeone3000> joannac: But if I have a mongos instance in New York and another in Singapore, where do I place the config servers?
[22:16:50] <joannac> yes and no. mongods need to communicate with them as well
[22:17:07] <leifw> freeone3000: the config servers do a form of 2PC but they don't store very much data and they are updated pretty infrequently. The shard servers also need to contact the config servers so don't only think about the mongos routers
[22:17:25] <freeone3000> leifw: Yeah, but they're queried often.
[22:17:32] <joannac> freeone3000: wherever gets you the least latency
[22:17:59] <freeone3000> joannac: Which would be "in the same datacenter". So why not have two copies of the three-server conf cluster that have the same hostname?
[22:18:08] <leifw> Everyone in the cluster caches the config info heavily
[22:18:18] <joannac> freeone3000: how are you going to keep them in sync?
[22:18:42] <freeone3000> joannac: Ah. The mongod writes, okay.
[22:19:55] <joannac> freeone3000: I'm not sure what your use case is, but I'm worried
[22:20:15] <joannac> especially when you want multiple sets of config servers and you're sharding to direct users to "the correct database"
[22:20:39] <freeone3000> joannac: Trying to reduce app server -> mongos -> mongod latency.
[22:21:10] <freeone3000> joannac: Which involves either storing a complete copy of the entire database everywhere there's an app server, or it means splitting up the data and still somehow making it accessible in case the user moves.
[22:21:23] <freeone3000> joannac: Sharding provided an obvious path to option 2.
[22:26:16] <joannac> How does sharding get you option 2?
[22:27:13] <harttho> anyone know how to handle int64 in node/mongo
[22:27:23] <harttho> I'm trying to use them as an $inc field, but am getting a 'for numbers only' error
[22:27:32] <Derick> harttho: not sure whether that is actually possible
[22:28:43] <Derick> https://github.com/broofa/node-int64 suggests that it is not possible without hacks
[22:28:45] <harttho> mongo may not allow $inc for special typed numbers?
[22:29:04] <Derick> harttho: it's not mongodb that prevents this, it's the javascript engine that powers node
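(In the mongo shell the 64-bit type is explicit, which is a hedged way to show what the node driver's Long type maps to; plain JavaScript numbers are doubles, which is the limitation Derick is pointing at. Collection and field names are made up:)

    db.counters.insert({ _id: "hits", n: NumberLong(0) })
    // $inc with a NumberLong keeps the field a 64-bit integer on the server
    db.counters.update({ _id: "hits" }, { $inc: { n: NumberLong(1) } })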
[22:29:45] <freeone3000> joannac: Because each query has info about where the user originally was, so it allows me to route data based on that to the proper database. You put the American users on an American shard, put the Singaporean users on a Singapore shard, then when queries hit the router, they're directed by mongos where they should go.
[22:29:52] <joannac> hey Derick -- are you back home?
[22:30:23] <freeone3000> joannac: If the user happens to be in America at the time, even though they were from Singapore, they're still routed to the Singapore database, so we still have their data.
[22:31:08] <Derick> harttho: I meant: https://groups.google.com/d/msg/nodejs/TggGyJvgIw0/LxbV2Oe02DsJ
[22:32:43] <freeone3000> joannac: And yeah, we likely could do a replica set instead with a ton of non-voting secondaries, but we also need the other benefits of sharding - taking some load off the primary for writes and putting the data on more than one server.
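(One hedged way to express the geography-based routing freeone3000 describes is tag-aware sharding; the namespace, shard names, tags, and the { region, userId } shard key below are all assumptions, not his actual setup:)

    sh.shardCollection("app.users", { region: 1, userId: 1 })
    sh.addShardTag("rs_us", "US")
    sh.addShardTag("rs_sg", "SG")
    sh.addTagRange("app.users", { region: "US", userId: MinKey }, { region: "US", userId: MaxKey }, "US")
    sh.addTagRange("app.users", { region: "SG", userId: MinKey }, { region: "SG", userId: MaxKey }, "SG")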