[01:00:42] <davidbanham> Hi all. I'm getting my butt kicked by setting up a replica set. I think I have tracked the problem to: "Received heartbeat from member with the same member ID as ourself: 0"
[01:00:45] <davidbanham> However I cannot find any information on mongodb member IDs.
[01:04:02] <clay__> I am kind of in a bind and was hoping someone more knowledgeable than me could help :)
[01:04:42] <clay__> error loading initial database config information :: caused by :: Couldn't load a valid config for voipo.cdrtest after 3 attempts. Please try again.
[01:04:51] <clay__> thats the error I'm getting on my sharded cluster
[01:05:21] <clay__> I've done some research online and people point to the config servers as a possible problem but I'm not sure what command to run to "repair" the config server
[01:17:44] <joannac> davidbanham: check rs.conf() on all members of the replica set
[01:18:20] <joannac> clay__: look at the logs before that, why can't it load the config? what's the error?
[01:18:34] <davidbanham> joannac: On the master, it was listing the other two nodes as unresponsive. Connectivity between all members is fine. I can mongo --host from one to another without problems.
[01:30:47] <davidbanham> It won't let you rs.initiate() unless you're a user with root perms. It won't let you add a root user without a majority. It's a bit of a dance: start in non-rs mode, add the users, then restart in rs-configured mode, auth, etc etc.
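A minimal sketch of the dance described above (hostnames and credentials are illustrative; on 2.6+ the localhost exception also lets you create the first user after auth is enabled):

    // 1. start mongod without --replSet, connect locally, add the admin user:
    use admin
    db.createUser({user: "root", pwd: "secret", roles: ["root"]})
    // 2. restart each node with --replSet rs0 --keyFile /etc/mongo-keyfile, then:
    db.auth("root", "secret")
    rs.initiate({_id: "rs0", members: [{_id: 0, host: "db1.example.com:27017"}]})
    rs.add("db2.example.com:27017")
    rs.add("db3.example.com:27017")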
[01:36:33] <clay__> then all i see is Assertion failed while processing query op for voipo.$cmd :: caused by :: 13282 error loading initial database config information :: caused by :: Couldn't load a valid config for DATABASE.COLLECTION after 3 attempts. Please try again.
[01:36:34] <joannac> clay__: all the config servers are up?
[01:53:17] <davidbanham> I think I'll just burn it down and start again with every machine knowing its externally reachable hostname at an OS level. Thanks for helping me track down the issue, joannac
[01:53:49] <joannac> clay__: well, when did it break? are you out of disk, maybe?
[06:07:03] <jry123> hey guys, quick question -- i have documents with 15+ language translations which can be quite large (well under the 16mb document size though)... i only need to use the keys for the current user's language, so obviously would be electing to only return those fields.... is this strategy ok? or should i split the translations into multiple documents?
[06:08:13] <jry123> obviously the single document approach is way easier, i just wonder from a memory perspective, there will be millions of records in this collection (converting from an old MySQL DB)
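A sketch of the single-document approach with a projection that returns only the current user's language (field names hypothetical):

    // all translations live in one document...
    db.articles.insert({_id: 1, title: {en: "Hello", fr: "Bonjour", de: "Hallo"}})
    // ...but only the requested language comes back over the wire
    db.articles.find({_id: 1}, {"title.en": 1})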
[07:41:29] <gothos> Hello, anyone here familiar with casbah? I'm trying to extract data from an array that has some object in it, it looks like this: "perfData" : [ { "pi" : "3.1" , "minute" : "3"}, ... ]
[07:41:31] <gothos> I'm getting a MongoDBList but can't find a way to get the object within, any recommendations? :)
[09:07:22] <willemb> Hi. I need some advice about running a mongodb replication set across multiple datacentres.
[09:21:29] <sudomarize> If i'm storing categories in mongo, what's the best way to do this? e.g. for a category "House Cleaning", how should i store this?
[09:23:08] <willemb> I have 3 nodes in dc A and 2 in dc B. the B-side's priority is set to 0 so that they don't become master (the services that query this db are all in dc A). However, when dc A goes awol, it is pretty hard to make one of them a primary while there is no primary available.
[09:42:35] <sudomarize> why is this channel so dead?
[09:46:40] <Derick> sudomarize: everybody is checking their mails first on Monday morning - and, it's still sleep time in the US.
[09:47:38] <Derick> willemb: I'd make the DC A nodes have priority 2, and the ones in DC B priority 1
[09:47:47] <Derick> willemb: so that they can become primary
[09:48:11] <Derick> willemb: but note, that if DC A goes fully down, the 2 (out of 5) in DC B can not see the majority, and hence not elect a primary
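A sketch of the reconfiguration Derick suggests (member order and hosts are illustrative):

    cfg = rs.conf()
    cfg.members[0].priority = 2  // DC A
    cfg.members[1].priority = 2  // DC A
    cfg.members[2].priority = 2  // DC A
    cfg.members[3].priority = 1  // DC B, was 0
    cfg.members[4].priority = 1  // DC B, was 0
    rs.reconfig(cfg)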
[09:53:26] <sudomarize> Derick, do you have any suggestions about the best way of setting up categories in my schema? i'm a bit stuck here
[09:53:52] <Derick> sudomarize: You need to provide a little more information
[09:54:06] <Derick> on what you need to store, on how you add the data, and how you need to query it
[09:57:08] <sudomarize> Derick: no worries. So i'm creating a marketplace that allows users to create tasks. In terms of the parameters of listing a task (for the lister), i want to provide most of the structure, i.e. there will be a set of predefined categories and subcategories (3 tiers). But I imagine there will still be significant variation, even between two tasks with the same category tree, so would implementing a tag: {}
[09:57:09] <sudomarize> object in my schema (and then doing a search for both categories and tags) be a good idea?
[09:58:17] <sudomarize> i'm also not sure what the best way of storing that value in my db is, i.e. should i store it just as the formatted string, e.g. "House Cleaning", or as a key-value pair, something like {category: {name: "house-cleaning", formatted: "House Cleaning"}}
[09:59:41] <sudomarize> i've looked on the mongo page for creating categories, which seems relatively complex compared to the SQL solutions which i'm used to
[09:59:54] <KekSi> anyone had this problem before? 2015-04-20T09:57:14.038+0000 E - [mongosMain] error upgrading config database to v6 :: caused by :: could not load config version for upgrade :: caused by :: socket exception [CLOSED]
[10:00:25] <KekSi> when i'm trying to restart a router after upgrading docker
[10:01:03] <sudomarize> and finally i'm a little worried about how making changes to the category structure will affect searches, i.e. is there a way to make changes to my category tree without causing issues when i run queries?
[10:05:43] <sudomarize> Derick: also what way do you think is best of adding categories? should i add a large amount initially, or add them slowly after they are requested by users? basically i want to be able to manage which categories are able to be searched, so only very popular categories are shown
[10:07:12] <Derick> what would you do about the tiers?
[10:13:19] <sudomarize> Derick: in terms of having a dynamic set of category tiers, i'm still not sure. But generally speaking i want to manage the quality of the categories that are available to users, to make search and listings really easy. there's a good guide on the mongo site about setting up product categories (inc a tiered system), so with a few adjustments that can easily be modelled to my particular needs
[10:14:01] <Derick> so you're asking how to store a category with each post? I don't quite understand yet
[10:17:56] <sudomarize> Derick: basically i'm asking 1. what's the best way of storing categories in mongo (e.g. an easy json name and then a formatted name for HTML like {key: "housecleaning", formatted: "House Cleaning"}), 2. is there a way to make just a category tree dynamic, that is to say, adding, removing or moving categories up or down the tree and 3. Assuming i have a relatively strict category system in place, is the best way
[10:17:58] <sudomarize> of differentiating listings with the same categories using a tagging system?
[10:18:19] <sudomarize> Derick: hope that's not too vague, still trying to figure a lot of this stuff out at the moment
[10:18:42] <Derick> 1 - store it as you'd show it. It means less things to store
[10:19:08] <dvass> how can I manage my collections and data from it?
[10:19:46] <Derick> 2 - storing path structures is not very simple to do. But I think you found the docs on doing that. But I am not sure what you mean by "just make a category tree dynamic"
[10:19:54] <Derick> and 3 - what do you need to differentiate on?
[10:23:34] <sudomarize> Derick: basically so that i can edit and move categories up and down the category tree, say from tier 3 to tier 2
[10:24:13] <Derick> sudomarize: that's for question 2, right?
[10:24:18] <sudomarize> Derick: RE 3: well if a particular user has expertise in a subset of a tier 3 category, then there's no way for users searching for tasks to find it
[10:24:40] <Derick> sudomarize: i think you'll have to do most in your application for the tree management
[10:25:01] <Derick> sudomarize: oh - hmm, for 3 - yeah, tags will work I suppose.
[10:25:24] <Derick> but what if a task requires multiple categories?
[10:25:58] <sudomarize> Derick: say gardening was a tier 3 category, and there's a lister who has expertise in growing roses, currently there's no way for them to let users looking for gardeners know this (except in the description of the task, but users can only search with categories <- to make the experience cleaner)
[10:26:33] <sudomarize> Derick: what would an example of multiple categories be?
[10:27:22] <Derick> sudomarize: a task that requires both rose-tending and fence repair?
[10:27:25] <sudomarize> multiple tags or multiple categories, because they would only be able to add multiple categories due to the nature of the category tree
[10:27:41] <sudomarize> Derick: ah right i see what you're saying
[10:28:04] <Derick> i'd probably make "growing roses" a tier 4 cat
[10:28:32] <sudomarize> Derick: this is a perfect example of where i think tags would be useful
[10:28:52] <sudomarize> Derick: otherwise i'd have to make n categories to fit every particular skill set out there
[10:29:29] <sudomarize> growing roses -> rare temperate roses -> a particular species of rose -> etc
[10:30:03] <sudomarize> i mean 3 tiers is a relatively arbitrary number of tiers to have, but i think it gives good depth while retaining a relatively easy UX
[10:30:50] <sudomarize> Derick: but that's also what i was talking about before with a dynamic tree. there are probably some instances where >3 tiers is required
[10:32:08] <sudomarize> is it just me, or is this quite a complex problem?
[10:32:24] <sudomarize> really having trouble getting my head around it
[10:46:07] <Derick> sudomarize: trees in a database are always tricky
[10:47:02] <Derick> (unless you use a graph database)
[10:57:15] <KekSi> right, apparently it was a docker problem (upgraded from 1.5 to 1.6) -- just in case someone else is having odd problems
[12:40:16] <deathanchor> so one of my secondaries fell out of sync beyond the replWindow, and is stuck in recovery with this in the logs: [rsBackgroundSync] replSet not trying to sync from MEMBER it is vetoed for 80 more seconds
[12:40:41] <deathanchor> no way to recover other than clearing the data dir and doing a fresh sync?
[12:42:44] <cheeser> unless you have a backup via, say, mms.
[12:42:55] <cheeser> you could restore a checkpoint and then sync from there.
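A sketch of the fresh-sync option deathanchor describes, assuming an init-script install with a default dbpath (paths are illustrative):

    # on the stuck secondary
    sudo service mongod stop
    mv /var/lib/mongodb /var/lib/mongodb.old   # keep until the resync completes
    mkdir /var/lib/mongodb && chown mongodb: /var/lib/mongodb
    sudo service mongod start                  # rejoins and performs an initial sync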
[12:44:20] <cheeser> lots of pros/cons to any solution
[12:45:36] <sudomarize> cheeser: how would you go about it?
[12:45:57] <sudomarize> really out of my depth here
[13:06:41] <arussel> I have a replica set without authorization/authentication. I'd like to introduce it, so: 1: db.createUser({user: "root", pwd: "foobar", roles: [{role: "userAdminAnyDatabase", db: "admin"}]})
[14:10:38] <greyTEO> Did you ever use cloud server images as backups? http://docs.mongodb.org/ecosystem/platforms/rackspace-cloud/#cloud-servers-images
[14:16:59] <GothAlice> greyTEO: None of my backup mechanisms require locking like that example. Interesting that it's an option, though. (Also: don't get me started on the pricing of this type of "backup".)
[14:18:21] <GothAlice> Backups should be a) as cheap as possible, b) as difficult to restore as possible. "Deep storage" is great. If it takes an hour to get anything out of it, one tries extra special hard to make sure one doesn't need to use it.
[14:26:29] <greyTEO> Yea they are going to wreck you in the cloud storage price for those backups.
[14:29:23] <GothAlice> I also get giggles every time I read through platform instructions that focus on single distributions. "Here's how you do everything you need to do on Rackspace… if you use Debian/Ubuntu."
[14:30:35] <greyTEO> Tis true about the backups being hard to extract data from. PITA is the best motivation
[14:31:37] <greyTEO> I thought it was weird they have a section for Rackspace...on Ubuntu. Shouldn't they just have a section for Ubuntu and stop recommending those specific solutions?
[14:31:40] <GothAlice> Alice's Law #105: "Do you have a backup?" means "I can't fix this." ;)
[14:32:20] <GothAlice> Yeah, I wish these sections only had details that actually were unique to that platform. (I.e. the backup thing.) Installation instructions are best left to _installation instructions_.
[14:32:46] <greyTEO> 1.) what happened? 2.) do you have a backup? 3.) ???? lawlz
[14:32:47] <GothAlice> (And there really didn't seem to be anything unique from an installation standpoint.)
[14:34:22] <greyTEO> LVM looks to be the most widely used option. I'm not that familiar with it.
[14:35:10] <greyTEO> mongodump doesn't seem to be that scalable and doesn't cover indexes
[14:35:52] <GothAlice> Hmm; not sure where you'd get those impressions from.
[14:37:48] <GothAlice> You could always run multiple dumps in parallel, each dumping different collection sets, potentially from different replica secondaries. The latest version even incorporates parallelism directly into the restore tool, too.
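A sketch of that parallel-dump idea, one mongodump per collection against different secondaries (hosts and names are illustrative):

    mongodump --host sec1.example.com --db shop --collection orders --out /backup/run1 &
    mongodump --host sec2.example.com --db shop --collection users --out /backup/run1 &
    wait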
[14:38:27] <greyTEO> I thought on restore you have to rebuild the indices?
[14:39:15] <GothAlice> And information about indexes is, in fact, backed up. "Backing up" an index doesn't actually make any sense. The location of the data in the restored stripes will differ from the original stripes, so the original index data is utterly useless anyway…
[14:41:11] <greyTEO> the data would be useless but it would cut back on the restore time. depending on the indices and amount of data though
[14:41:22] <GothAlice> In fact, it'd slow it down.
[14:41:31] <GothAlice> You'd be "restoring" bogus data you'd then have to re-iterate and correct anyway.
[14:41:54] <GothAlice> Thus there's literally no point in even attempting to do that; just rebuild the index either as you go, or once at the end, and be done with it. (Iterate once, not many times.)
[14:42:39] <greyTEO> well restoring from a dump would re-import. I am thinking more point in time snapshots.
[14:42:48] <GothAlice> You _can not snapshot indexes_.
[14:43:02] <GothAlice> Index data outside of the literal instance that is storing the at-rest data is _meaningless_.
[14:43:25] <greyTEO> if you snapshot the instance, it would.
[14:44:49] <GothAlice> There are several types of snapshot; only a filesystem-level snapshot would benefit from preserving index data, and, well, has the added bonus of MongoDB being completely unaware. So yeah, that'd work. (It's also the least efficient method to do a backup.) mongodump with the --oplog option is also a point-in-time snapshot.
[14:45:08] <GothAlice> It's still a re-compacted dump, though, so storing index data would continue to be a waste here.
[14:46:35] <greyTEO> I was just reading that mongodump with oplog gives PIT
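A sketch of that point-in-time dump and its restore (host is illustrative; --oplog requires dumping from a replica set member):

    mongodump --host secondary.example.com --oplog --out /backup/pit
    # replay the captured oplog slice up to the moment the dump finished
    mongorestore --oplogReplay /backup/pit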
[14:47:30] <GothAlice> Of course, in production systems "high availability" should be the only backup one needs. Have replication secondaries in-DC, outside the DC, at your office, at your home… so worse-case and you need to switch hosting providers or data centres? Spin up a node, let it replicate, and keep on keeping on. ;)
[14:51:23] <greyTEO> Filesystem might not be the most efficient but wouldn't that be the best way to get a hot backup without interrupting performance...(given you could run it at your slowest time of traffic)
[14:51:41] <GothAlice> greyTEO: No, replication = best hot backup.
[14:52:15] <GothAlice> Replication typically has no "instantaneous load" (i.e. it's constantly streaming, so shouldn't have peaks in activity that slow down the rest of the system).
[14:55:20] <GothAlice> ^ Read through all of this.
[14:56:15] <GothAlice> Replication is how you get reliable service in MongoDB. Point in time snapshots / backups are not. (Always remember law #105. Backups are what happens when everything else is on fire.)
[14:56:29] <GothAlice> Actually, totally need to add that quip as a follow-up.
[14:57:29] <greyTEO> I appreciate the info. I'm still learning the ins & outs of mongo
[14:58:25] <greyTEO> with mysql http://www.percona.com/ is the best way for hot backups and PIT. restore is crazy easy and it's all filesystem based.
[14:58:49] <greyTEO> I was applying the same logic but I guess it doesn't always apply
[14:59:06] <GothAlice> It can be. Unless AWS decides that if your EC2 instance needs to touch an EBS volume, it should lock both the instance and the volume. Which _has happened multiple times_ on that service. ;)
[14:59:18] <GothAlice> I.e. filesystem snapshots as a backup process can destroy your business.
[14:59:56] <GothAlice> Also MySQL. (In one of those failures I had to spend the 36 hours prior to getting my wisdom teeth extracted to reverse engineer the on-disk InnoDB format. Managed to recover all the data, amazingly.)
[15:01:22] <GothAlice> greyTEO: I switched to Postgres, and used WAL-E to stream the write-ahead log, i.e. oplog, straight from the Postgres server to S3 and several other destinations. None of my Postgres servers had permanent storage, because like MongoDB, I had them configured to self-repair on startup.
[15:01:32] <GothAlice> These days I just use MongoDB. ;)
[15:02:54] <Vitium> I'm still seeing things in a relational way
[15:02:58] <digi604> Hi everybody… what is the easiest way to copy a db from a replica set with 2 servers to a 6 server cluster? db.copyDatabase seems not to work on clusters…
[15:03:06] <Vitium> I'd create a table for the product and a table for the reviews
[15:03:08] <GothAlice> Vitium: http://www.javaworld.com/article/2088406/enterprise-java/how-to-screw-up-your-mongodb-schema-design.html may be useful, if you're up on the formal terms. :)
[15:04:07] <digi604> GothAlice: the db is too big for that one...
[15:04:51] <GothAlice> digi604: You do realize mongodump/mongorestore aren't limited to only running locally? I.e. you can run it from a machine with enough scratch space.
[15:05:14] <GothAlice> (Or attach an EBS volume or equivalent, for the duration of the migration.)
[15:05:25] <digi604> GothAlice: ok that seems to be the way to go...
[15:05:38] <GothAlice> Won't be fast, but you only asked for easy. ;)
[15:05:55] <digi604> GothAlice: what is the hard way to do it?
[15:06:15] <digi604> no let me rephrase this: what is the fastest way to do it...
[15:06:35] <GothAlice> digi604: Introduce the machine with the data on it into the new cluster, and let it re-balance. That's the hard way. ;)
[15:07:02] <GothAlice> I don't know which would be faster. Balancing can be quite slow.
[15:07:24] <digi604> GothAlice: can i remove hosts afterwards?
[15:08:30] <GothAlice> digi604: http://docs.mongodb.org/manual/reference/replication/ and http://docs.mongodb.org/manual/administration/replica-sets/ for some reading.
[15:09:05] <digi604> so if i have 3 shards and 3 replicas…. and i add the 2 existing nodes to it as an additional shard i can later remove the 2 existing nodes (one complete shard)?
[15:09:10] <GothAlice> Vitium: The article I linked should be quite helpful in exploring your current model. In MongoDB, dependant data like the reviews can often be embedded within the "parent" document, since when getting the review you'd likely also be wanting a copy of the product data, and when you delete a product, cleaning up the reviews is a natural second step.
[15:09:16] <GothAlice> (Embedding means it's all handled in one step.)
[15:09:45] <GothAlice> digi604: Yes; though when you ask it to remove the node, you'll have to wait for it to shuffle the data off that node before shutting it down.
[15:10:00] <digi604> GothAlice: ok that helped… thnx
[15:10:20] <GothAlice> digi604: I'd still recommend the dump/restore approach. ;)
[15:10:41] <GothAlice> Fewer things can go wrong. ^_^
[15:10:42] <Vitium> GothAlice, You mean like an array?
[15:11:21] <GothAlice> Vitium: Exactly; your reviews would be an array of embedded documents. e.g. product = {_id: …, name: "Walkman", manufacturer: "Sony", reviews: [{author: "Alice", comment: "It's boss."}, …]}
[15:13:06] <GothAlice> When there are so many reviews on something that the record fills up, you need to have a contingency plan. In my forums, where I store replies to a thread within the thread itself, I handle this by automatically starting a new thread, then linking the old one to the new one.
[15:13:59] <GothAlice> Vitium: It's important to not go overboard with nesting things; you can only effectively query one array per document per query.
[15:15:23] <GothAlice> You _can_ have "references" (typically just the ID of the target record, or a DBRef) between collections (tables), but without joins any time you need to gather information from a different collection, you need to perform an additional query.
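A sketch of the extra query a reference implies (collections and fields are illustrative):

    var review = db.reviews.findOne({_id: someReviewId})
    // no join: fetching the referenced product is a second round-trip
    var product = db.products.findOne({_id: review.product_id})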
[15:15:40] <Vitium> Yeah that's what I was thinking
[15:15:43] <Vitium> Hard to see things in a new way
[15:17:01] <GothAlice> Any time you get the sensation of wanting a JOIN, smack yourself. MongoDB has no relational capability whatsoever, and modelling with such non-existent capability in mind will only lead to slow queries and mangled code. ;)
[15:17:30] <Vitium> I'd slap myself into a coma though
[15:17:32] <GothAlice> See also: http://docs.mongodb.org/manual/reference/sql-comparison/ and http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/
[15:20:01] <StephenLynx> What she said. Any time you need to do something that would require a join in sql, you are doing it wrong.
[15:20:30] <StephenLynx> I personally see (few) uses for fake relations, but don't you ever think of them as your first, second or third option.
[15:20:35] <GothAlice> "The Gods of Excel have poisoned your minds against your data, friends!"
[15:24:48] <Vitium> Aren't they basically showing a join with the One-To-Many example? http://blog.mongodb.org/post/87200945828/6-rules-of-thumb-for-mongodb-schema-design-part-1
[15:25:23] <Vitium> The array is just a bunch of ObjectIDs
[15:28:33] <GothAlice> Vitium: Indeed, that example is simulating a fake join at the application level.
[15:29:22] <GothAlice> However the full impact of this is difficult to gauge from that example: it implies a second complete round-trip, you currently can't get results back in arbitrary orders (i.e. give me records X, Y, and Z in the order [X, Y, Z]), etc.
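A sketch of the application-level join in that post's products/parts example, including the reordering step you must do yourself:

    var partIds = db.products.findOne({catalog_number: 1234}).parts
    var parts = db.parts.find({_id: {$in: partIds}}).toArray()
    // $in does not preserve order; restore it in the application
    var byId = {}
    parts.forEach(function (p) { byId[p._id] = p })
    var ordered = partIds.map(function (id) { return byId[id] })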
[15:31:10] <GothAlice> Unfortunately, that article states "Each Part is a stand-alone document, so it’s easy to search them and update them independently." which would seem to imply that you lose this ability if you embed. You do not.
[15:31:32] <Vitium> I'm just concerned about the size
[15:31:59] <GothAlice> To the point that the entire gaming forums I was converting from phpBB to my MongoDB solution in Python could fit in a single record.
[15:32:24] <StephenLynx> yeah, unless you expect your subdocuments to be really, really, really, huge, size is not an issue.
[15:32:55] <Vitium> So 16MB is the max size for any document?
[15:37:54] <GothAlice> I suspect people read too much into these things.
[15:38:09] <GothAlice> "While mongodump is sufficient for small deployments, it is not appropriate for larger systems. mongodump exerts too much load to be a truly scalable solution. It is not an incremental approach, so it requires a complete dump at each snapshot point, which is resource-intensive." — All true.
[15:38:26] <GothAlice> Also, notably, all things that will vary from deployment to deployment and application to application in terms of acceptable load, etc.
[15:38:53] <greyTEO> lol they recommend filesystem snapshot
[15:38:58] <greyTEO> which is way more complicated
[15:45:13] <GothAlice> Manual invocation of runCommand is a good warning sign of something being done more "craftily" than it needs to be, though. Driver support is pretty solid for almost all commands, excluding some pretty low-level administrative ones.
[15:45:58] <Derick> GothAlice: for the new PHP/HHVM driver we're not implementing *any* helpers in the base extension
[15:46:05] <Derick> but leave it to a library on top
[15:46:08] <Left_Turn> ah ok ... I'll take the advice... also thanks Derick for all the links.. got a lot of reading to do:)
[15:47:03] <GothAlice> Derick: Well, that does seem to be PHP's modus operandi: expose raw C function calls to the application layer. Which C calls? ALL THE C CALLS!
[16:02:11] <juliofreitas> Hi! Every 15 minutes I get a file with time series data, structured this way: time (2015-03-27T20:00:01Z), state (string), owner (string). What's the best way to store it and search it with great performance, given my searches will be by day, choosing an interval (15 minutes, 30 minutes, 1 hour...)?
[16:04:00] <cheeser> juliofreitas: this might help: http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb
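A sketch in the spirit of that post: pre-aggregate one document per owner per day and $set each sample into a numbered 15-minute slot (field names are illustrative):

    db.samples.update(
        {_id: "owner42/2015-03-27"},
        {$set: {"slots.80": {time: ISODate("2015-03-27T20:00:01Z"), state: "up"}}},
        {upsert: true}
    )
    // slot 80 = 20:00 / 15 min; a day holds 96 slots, so day/interval reads hit one document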
[16:05:20] <juliofreitas> cheeser, fine fine thank you :D
[16:21:39] <ParisHolley> “driver is incompatible with this server version”, on 3.0.2 with latest 2.x of mongo native driver and mongoose
[16:22:17] <StephenLynx> have you tried without mongoose?
[16:22:32] <StephenLynx> I have just deployed a server and didn't have any issues with the driver.
[16:22:39] <ParisHolley> no, all my projects use mongoose, did a server migration and things just seemed to work fine, except one app
[16:22:57] <ParisHolley> seems to be happening out of nowhere
[16:24:02] <ParisHolley> and, it only happens on writes
[17:36:18] <StephenLynx> you are not in kansas anymore, dorothy.
[17:38:01] <StephenLynx> I've been working with node (now io.js) since September 2014 I guess, using just the RE and mongo driver, I am glad to work with web for the first time in my life.
[17:38:38] <StephenLynx> just released one of my two projects that use a back-end https://gitlab.com/mrseth/bck_lynxhub
[17:39:02] <StephenLynx> not having to deal with bloat is a bliss.
[17:39:54] <StephenLynx> so yeah, I suggest you at least trying to not use a framework for once, Vitium
[17:40:05] <StephenLynx> also, migrate to io.js, it's everything node is, but better and faster.
[17:40:39] <Vitium> StephenLynx, Using node.js without a framework?
[17:42:46] <Vitium> I'm learning this for a school project, I don't have enough time to code all that
[17:42:59] <snowcode> okay now there is a "Point must only contain numeric elements" error inserting my GeoJSON Data into mongodb. If I try to insert a simple coordinate point [[lon,lat]] it works
[17:43:00] <StephenLynx> the learning curve is higher for these frameworks.
[17:43:06] <snowcode> If I try to insert a polygon
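For reference, a valid Polygon insert: coordinates are an array of linear rings, each ring closed (first point repeated last) and built from numeric [lon, lat] pairs; that error usually means strings or a stray nesting level in the coordinates (collection name is illustrative):

    db.places.insert({
        loc: {
            type: "Polygon",
            coordinates: [[
                [0.0, 0.0], [3.0, 0.0], [3.0, 3.0], [0.0, 3.0], [0.0, 0.0]
            ]]
        }
    })
    db.places.createIndex({loc: "2dsphere"})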
[17:57:32] <christo_m> mac native, ill try it, thx
[18:13:48] <gswallow> were there deprecation notices that simple things like Mongo::Connection.new() would just *stop* working when other people's code requires mongo gems, and you released 2.0.0?
[18:16:10] <GothAlice> gswallow: There are still people in existence who don't pin version number ranges? O_o
[18:17:06] <GothAlice> The latest update broke a fair number of things in Python-land, but "pymongo<3" is really, really easy to define as your package's requirement.
[18:19:42] <gswallow> not my package, but were there deprecation warnings?
[18:20:01] <gswallow> because I'd have thought about it sooner if there were (unless I ignored them, which is equally likely)
[18:26:37] <arussel> is that true: if I don't add the keyFile attribute, then anyone can do anything (if they can reach the host)?
[18:28:25] <GothAlice> But! If you do open it up to connections from anywhere, be sure to either enable authentication (and use good passphrases) and/or set up strong firewall rules.
[18:28:49] <arussel> so 1. I add keyFile to each of my nodes. 2. add an admin user 3. add an 'app' user
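A sketch of those three steps (key path, passwords, and the app database name are illustrative):

    # 1. same key on every node; --keyFile implies auth
    openssl rand -base64 741 > /etc/mongo-keyfile && chmod 600 /etc/mongo-keyfile
    mongod --replSet rs0 --keyFile /etc/mongo-keyfile ...

    // 2. via the localhost exception, the admin user:
    use admin
    db.createUser({user: "root", pwd: "foobar", roles: [{role: "userAdminAnyDatabase", db: "admin"}]})
    // 3. the application user on its own database:
    use myapp
    db.createUser({user: "app", pwd: "secret", roles: [{role: "readWrite", db: "myapp"}]})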
[18:35:02] <StephenLynx> it only throws an error when you try to do something the user is not allowed.
[18:35:12] <StephenLynx> is not allowed to do unless authenticated*
[18:35:44] <StephenLynx> I could get it to connect and get references to collections fine with a failed authentication.
[18:37:41] <gnu_d> Hi, I'm researching sharding. Given two mongo servers, say the first holds the newest data and the other holds data older than 6 months: is it OK to make a scheduled job that searches for documents older than X and relocates them to the other shard, or is this done automatically when configuring the shard? If so, can you tell me how?
[18:39:53] <cheeser> gnu_d: you might consider http://docs.mongodb.org/manual/core/tag-aware-sharding/
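A sketch of the tag-aware approach cheeser links (shard names, namespace, and the cutoff date are illustrative; the balancer then moves chunks to the tagged shards automatically):

    sh.addShardTag("shard0000", "recent")
    sh.addShardTag("shard0001", "archive")
    // with a shard key leading on a date field, e.g. {created: 1}
    sh.addTagRange("mydb.cdrs", {created: MinKey}, {created: ISODate("2014-10-20")}, "archive")
    sh.addTagRange("mydb.cdrs", {created: ISODate("2014-10-20")}, {created: MaxKey}, "recent")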
[19:20:04] <christo_m> StephenLynx: quick question, data design thing
[19:20:21] <christo_m> I have a concept of a feed or queue for users. is it better to have it be a nested subdocument?
[19:20:38] <christo_m> it's a 1-1 relationship so having it separated out doesn't make much sense.. except for the fact that i don't want to dirty up the user code i have currently.
[19:20:47] <christo_m> i'd rather queues have their own endpoint in the API etc
[19:21:06] <StephenLynx> you don't have to dirty your user code.
[19:21:12] <StephenLynx> why would you need to do that?
[19:33:35] <christo_m> i'm working on something that has deadlines, i don't have time to muck about with configuration when something works out of the box
[19:33:47] <StephenLynx> yeah, but when you add dependencies that change your workflow
[19:33:50] <christo_m> i have a lot of work done in just 2 days with this setup
[19:33:57] <StephenLynx> you are increasing the learning curve.
[19:34:00] <christo_m> grunt is another tool in there that is amazing
[19:41:12] <StephenLynx> hey, I am an elitist on rizon too :c
[19:42:47] <GothAlice> I'm biased, as a contributor/owner to/of several of the projects I'll mention, but MongoDB has replaced a multitude of other services (queues, caches, etc.) for me, I use an ODM called MongoEngine that supplies, application-side, many nice things like efficient fake joins using caching, triggers, reverse delete rules, etc, and build a multitude of packages (under the "marrow" org on Github) that depend on it.
[19:43:02] <GothAlice> MongoDB is fantastic. MongoEngine is great. Mongoose is… a bad teacher.
[19:43:22] <christo_m> well again, this is what came out of the box with the generator
[19:43:36] <christo_m> so when i hit the problems you're talking about, ill consider it, seems to work fine for me right now
[19:44:59] <StephenLynx> that is the problem with these kinds of solutions: it's one big thing. it is heavy and obscure and asks you to shape your problem around its solution.
[19:45:10] <GothAlice> christo_m: The link I provided counters many of the common points others may attempt to use to nay-say MongoDB and can help avoid falling into some of the same traps. :)
[19:46:25] <christo_m> GothAlice: ya a lot of the things mentioned there are over my head right now
[19:46:33] <christo_m> i'd like to have the problems they're talking about, means i have users :D
[19:46:53] <christo_m> StephenLynx: except it isn't like that at all.. because the components are right there in front of you, there's nothing magical happening
[19:47:04] <christo_m> you can swap whatever you want
[20:03:49] <GothAlice> Now here's an interesting question: BSON itself has no particular restrictions on field names. (It's a combined Pascal+C string.) Most drivers seem to enforce no $ in field names. DBRef is a "complex" type (i.e. a compound of simpler types) that uses $ in its field names. Is it OK to create custom types like DBRef for your own use?
[20:03:53] <GothAlice> And notably, are there any examples of doing this?
[20:07:39] <StephenLynx> can you use $ on a field name in the terminal?
[20:07:52] <cheeser> StephenLynx: i don't think so
[20:08:03] <cheeser> GothAlice: there'd be no serverside support for it...
[20:08:22] <cheeser> but as far as bson goes, it'd just have to serialize to a bson document.
[20:08:56] <StephenLynx> so mongo enforces field names without $ as well?
[20:09:04] <GothAlice> That's also not exactly what I'm describing, StephenLynx. I don't wish to manually commit embedded documents with $ in their field names, I'm wanting to create a new BSON adapter (like DBRef) that self-encodes other types. Mostly looking for an example of extending pymongo for additional types.
[20:09:53] <GothAlice> StephenLynx: pymongo.errors.InvalidName: key '{key}' must not start with '$'
[20:09:59] <GothAlice> It's checked in the client driver.
[20:10:09] <StephenLynx> yes, but is it checked on the server itself?
[20:10:17] <StephenLynx> thats why I asked about the terminal.
[20:10:33] <GothAlice> The mongo shell is a client driver…
[20:11:20] <GothAlice> And as expected, it explodes when I attempt to use $ in a field name, at the client level. (src/mongo/shell/collection.js)
[20:12:38] <cheeser> right. fields can't have "." in the name either
[20:13:29] <GothAlice> cheeser: It's always dubious how much of the validation goes on server-side. "[object Object]" being a valid collection name always gives me pause to shake my head. ;)
[20:17:21] <GothAlice> Biggest problem I frequently see with Mongoose users: it effectively forces you to throw your hands up and treat all ObjectIds as plain strings. (I.e. the hex encoded binary form, taking up more than 2x as much space, and heaven help you if you mix real ObjectIds and string-form ones.)
[20:24:44] <mortal1> now , every time someone sends me code via hipchat, my eye twitches a little
[20:27:15] <christo_m> GothAlice: {"message":"Cast to ObjectId failed for value \"add\" at path \"_id\"","name":"CastError","type":"ObjectId","value":"add","path":"_id"} i don't even know what this is doing
[20:27:24] <christo_m> i'm not doing anything special.. just routed to an empty function with express.js
[20:27:38] <GothAlice> I have no idea what that means or is doing, either.
[20:29:08] <jhoff909> is there a recommended replica set config when running on linux in Azure? and where there are clients external to Azure?
[20:33:12] <StephenLynx> do you really have to use azure?
[20:33:29] <StephenLynx> I heard its pretty expensive.
[21:06:42] <snowcode> anyone know how to store a GeoJSON Polygon into mongodb? It still says "Point must only contain numeric elements" but i want to store a polygon, not a point
[21:08:52] <sudomarize> I have a Task collection, and each task is put into a category, e.g. one Task may have a category "iPads" <- "Apple" <- "Hardware" <- "Repair" (where the categories are a mongo tree, each one with a "parent" key pointing to the id of its category parent)
[21:09:59] <sudomarize> Would it make sense to put the category tree for a particular Task into a {categories: []} array in my Task document?
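A sketch of that idea, storing the ancestor path (root to leaf) in an array so any tier matches with one indexed query (values are illustrative):

    db.tasks.insert({
        title: "Cracked screen",
        category: "iPads",
        categories: ["Repair", "Hardware", "Apple", "iPads"],
        tags: ["screen replacement"]
    })
    db.tasks.createIndex({categories: 1})
    db.tasks.find({categories: "Apple"})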
[21:28:23] <greyTEO> what is the deepest level an object should be nested? speaking in terms of best practices...
[21:33:36] <GothAlice> greyTEO: No more than one level of nesting, unless you are _extremely_ careful.
[21:34:01] <greyTEO> so "address.locale.postalCode" is frowned upon?
[23:54:55] <GothAlice> bros: Alas, with standard queries you can't get the last element unless you know how many there are. You can project to the last using http://docs.mongodb.org/manual/reference/operator/projection/slice/ or use http://docs.mongodb.org/manual/reference/operator/aggregation/unwind/ followed by an additional match in an aggregate query.
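Sketches of both approaches (collection and field names are illustrative):

    // projection: return only the last array element
    db.posts.find({_id: someId}, {comments: {$slice: -1}})
    // aggregation: unwind, then match against the elements
    db.posts.aggregate([
        {$match: {_id: someId}},
        {$unwind: "$comments"},
        {$match: {"comments.author": "Alice"}}
    ])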