[00:14:19] <morenoh151> I'm getting `"ObjectId" used outside of binding context. (block-scoped-var)` lint warning. Do I require('mongodb') in this file?
[00:24:33] <morenoh151> oh cool mongoose has findById 👍
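A minimal sketch of both points above, assuming a hypothetical mongoose model named User and an id variable: mongoose re-exports ObjectId, so there is no need to require('mongodb') directly, and declaring it at the top of the scope avoids the block-scoped-var warning.

    var mongoose = require('mongoose');
    var ObjectId = mongoose.Types.ObjectId; // declared at module scope, not inside a block

    User.findById(id, function (err, user) {
      if (err) return console.error(err);
      console.log(user);
    });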
[00:54:41] <katfang> Hi! I'm migrating my java mongo driver from 2.11 to 2.13. I figured out how to migrate the credentials/auth, but is there a way to check the authentication before performing operations?
[01:35:14] <cheeser> katfang: you have to perform an operation against the DB first. could be as simple as listing the collections.
[01:36:15] <katfang> cheeser: Thanks. Is there really no other way? Or is there a really "cheap" operation? (I have a lot of collections)
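One conventionally cheap round trip, sketched in the mongo shell (the Java 2.13 driver can issue the same server command via DB.command()): the connectionStatus command reports which users the connection has authenticated as, without reading any collections.

    // no collection access; returns the authenticated users for this connection
    db.runCommand({ connectionStatus: 1 })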
[02:51:20] <dlewis> I've got a quick query question, how do I find all documents in a collection where the attribute of my condition is in an array? ex: https://gist.github.com/tetsuharu/25da90501e40fac2e16b
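For reference, matching a value against an array-valued field needs no special operator in MongoDB; the linked gist's schema isn't visible here, so the collection and field names below are assumptions.

    // matches documents whose tags array contains 'mongodb'
    db.items.find({ tags: 'mongodb' })
    // $in matches if the field (array or scalar) holds any of the candidates
    db.items.find({ tags: { $in: ['mongodb', 'redis'] } })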
[05:06:59] <arussel> is there a way to set multiple fields, but one of them only if it is not already set ?
[05:32:11] <Boomtime> arussel: I don't think you can do what you want in a single operation (use two update operations using $exists as a predicate)
[05:32:24] <Boomtime> you may be interested in this feature request: https://jira.mongodb.org/browse/SERVER-6566
[05:33:30] <Boomtime> alternatively, if you have a reasonable minimum bound you may be able to make use of $min
[05:37:56] <arussel> Boomtime: thanks, SERVER-6566 would be nice. I don't have a bound, those are strings
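Boomtime's two-operation workaround, sketched with hypothetical field names: `once` is only written when it is absent, while `always` is written in both cases.

    db.things.update({ _id: id, once: { $exists: false } },
                     { $set: { once: 'first value', always: 'latest value' } })
    db.things.update({ _id: id, once: { $exists: true } },
                     { $set: { always: 'latest value' } })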
[09:16:01] <oznt> i have a replica set with mongodb 2.6.4 is it a bad idea adding a new member with >2.6.4 ?
[09:28:45] <josias> hi, i have a collection with 8,361,936 documents and want to aggregate 1,381,740 of them. The matching costs 50ms but the aggregate needs 2888ms. why is it so slow? the command is db.collection.aggregate([ { "$match" : { "$and" : [ { "id.site" : 1 }, { "id.date" : { "$gte" : ISODate("2015-01-25T23:00:00.000+0000") } }, { "id.date" : { "$lte" : ISODate("2015-02-03T22:59:00.000+0000") } } ] } } ,{ "$group" : { "_id" : 1, pi : {
[09:30:51] <josias> it gets worse the more properties i add to the aggregation (5224ms if i aggregate year, month & day)
[09:43:56] <stark_> Hello Friends.....I am facing a problem with starting mongo
[09:44:16] <stark_> To run mongo every time I need to first run this command - sudo mongod --dbpath=/var/lib/mongodb
[09:44:43] <stark_> In next terminal I can use mongo
[09:45:03] <stark_> If I close the previous instance running sudo mongod --dbpath=/var/lib/mongodb, the mongo shell stops working
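What stark_ likely wants is mongod running detached from the terminal. A sketch, assuming a stock Ubuntu layout: either the distribution's service script or mongod's own --fork flag (which requires a --logpath) keeps the server alive after the terminal closes.

    sudo service mongodb start
    # or, by hand:
    sudo mongod --dbpath /var/lib/mongodb --fork --logpath /var/log/mongodb/mongod.log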
[12:10:20] <kali> i mean, if you plan on having 1M documents in a collection, and 1000 events/s to be recorded with $push, then yeah, you are in trouble
[12:21:48] <dsirijus> kali: i'm on mongoose. would $push mean inserting a new document or replacing the old one in its entirety?
[12:30:47] <StephenLynx> no need to have a fixed size array
[12:31:08] <StephenLynx> just don't use nested arrays for things that you expect to grow forever, dsirijus
[12:31:16] <StephenLynx> because eventually you will hit the 16mb limit.
[12:32:03] <StephenLynx> In my project I changed how posts in a thread are stored and how bans in a forum are stored too.
[12:32:27] <StephenLynx> because if I kept them on a sub array, a thread would have a post limit and a forum would have a limited number of active bans
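The pattern StephenLynx describes, sketched with assumed collection and field names: each post is its own document pointing at its thread, so nothing accumulates inside a single 16MB-capped document.

    db.posts.insert({ forum: 'tech', thread: threadId, text: '...', created: new Date() })
    // fetching a thread's posts is then an ordinary indexed query
    db.posts.find({ forum: 'tech', thread: threadId }).sort({ created: 1 })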
[13:07:00] <dsirijus> hey, um, another question - does mongo keep all its databases and all its collections in memory?
[13:07:57] <dsirijus> (thinking about splitting the db into two, one needed for actual application running, and other for sort of internal logging thingie)
[13:10:31] <dsirijus> hm, maybe i should pick another solution for these purposes
[14:04:25] <StephenLynx> does the driver for node support io.js when building?
[14:04:40] <StephenLynx> I know it can't build the native BSON module for pre versions
[14:04:48] <StephenLynx> unless you specify the install folder
[15:32:25] <GothAlice> StephenLynx: In my own forums software I resolved the 16MB limit differently. First, I evaluated the existing (phpBB-based) dataset, and discovered that all threads and all replies to all threads would easily fit within a single document (so not really a problem), but while still embedding replies into a thread, if the thread grows too large it automatically creates a new "continuation" thread. 16MB of text, though, amounts to ~3.3 million words.
[15:33:03] <GothAlice> For anything that needs to grow very rapidly, and where I can offload "processing" of the data that is streaming in, I use a capped collection. These are very efficient to insert records into, being a ring buffer.
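A capped collection, as GothAlice suggests, is created with an explicit byte size and behaves as a ring buffer: inserts are cheap, and the oldest documents are silently overwritten once the limit is reached. A minimal sketch, names assumed:

    db.createCollection('events', { capped: true, size: 64 * 1024 * 1024 })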
[15:33:12] <StephenLynx> GothAlice a valid approach, too "hackish" for me, though.
[15:33:27] <GothAlice> With 14,000 users, we have yet to encounter a continuation. ;)
[15:33:46] <GothAlice> (The average response size is 30 or so words.)
[15:34:16] <StephenLynx> it may not be put in use, but all the code and rules are still there.
[15:34:29] <GothAlice> With that average reply size each thread could store ~111,000 replies before requiring a continuation.
[15:35:45] <StephenLynx> another problem I found out is how limited it is to work with embedded arrays in mongo.
[15:37:04] <StephenLynx> and how you often have to perform additional queries or unwind arrays.
[15:37:05] <GothAlice> https://github.com/bravecollective/forums/blob/develop/brave/forums/component/thread/model.py#L99-L195 < these are the thread reply management functions in the original fork of my forums. Working with complex nested structures becomes easier when you realize you can give the embedded documents their own IDs. :D
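The linked code is Python/MongoEngine; the same idea in mongo shell terms, with assumed field names: once each embedded reply carries its own id, the positional $ operator can address it directly instead of rewriting the whole array.

    db.threads.update(
      { _id: threadId, 'comments.id': replyId },
      { $set: { 'comments.$.message': 'edited text' } }
    )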
[15:46:42] <StephenLynx> if you unwind an empty array you end up with nothing.
[15:47:03] <StephenLynx> and even with that, unwinding and doing all that impacts performance and ram usage.
[15:47:23] <StephenLynx> valid, of course. but it comes with costs.
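Demonstrating the behaviour StephenLynx means (on servers of this era; 3.2 later added preserveNullAndEmptyArrays to $unwind): a document whose array is empty simply vanishes from the pipeline.

    db.demo.insert({ _id: 1, tags: [] })
    db.demo.aggregate([{ $unwind: '$tags' }]) // yields zero documents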
[15:48:15] <GothAlice> Basically what I'm saying, having written forum software that works quite effectively at some truly ludicrous scale and doesn't seem to encounter the problems you are describing, at all, is that you seem to be making the problem harder on yourself. Re-normalize your data to simplify the queries (apparent "data duplication" may be beneficial simply to make querying easier) and accept that certain techniques aren't hacks, like the idea of continuation threads.
[15:48:41] <GothAlice> Notably they aren't really a hack because you'll realistically never need to use it. (3.3 million words per thread…)
[15:49:04] <StephenLynx> I do have data duplication.
[15:49:58] <StephenLynx> and basically what I'm saying is: you are more limited when working with sub-arrays because mongo simply can't do some stuff with sub-arrays.
[15:50:09] <StephenLynx> if you are ok with these constraints, then it's fine.
[15:50:09] <GothAlice> Right tool for the job and all.
[15:50:31] <GothAlice> Relational data restricts you to the data design of a spreadsheet. That's more restrictive, IMHO.
[15:50:33] <StephenLynx> if you are ok with workarounds for limits that you won't reach, that's fine too.
[15:51:17] <StephenLynx> restrictive in what sense? performance?
[15:51:33] <GothAlice> How you are forced to think about and structure your data.
[15:52:15] <StephenLynx> how are you not forced to follow a structure for your data when you use sub-arrays?
[15:52:38] <StephenLynx> you still have to be consistent, no matter the approach.
[15:52:43] <GothAlice> Nothing is forcing you to take that approach.
[15:53:09] <GothAlice> Saw one development team give a presentation on their Awesome Django App™ at a conference once, and spent the entire presentation with my head in my hands in disbelief. They had a relational table with five of each of the basic datatypes (str1, str2, …, int1, int2, …) and an internal mapping for different record types. Instead of, I dunno, using a schemaless database that can handle inheritance patterns better.
[15:53:40] <StephenLynx> I can't see the relation with what I am saying here.
[15:53:42] <GothAlice> Entity Attribute Value was invented to have a relational database act like a document one.
[15:55:21] <GothAlice> StephenLynx: I don't quite grok your complaints. Yes, no matter what you pick you will have to understand the tool to best use it. This shouldn't be a surprise or point of contention. I've demonstrated forums that don't exhibit the problems you describe at all as a counter-point to your list of "problems" and "hack solutions". Practicality beats purity!
[15:56:17] <StephenLynx> In my defense, I never said the limitations of mongo were absolutely a problem. I said that, for me, I'd rather not use sub-arrays for the sake of purity.
[15:56:17] <GothAlice> (And pre-aggregation of statistics is one highly useful way to avoid needing complex aggregate queries when the time comes to display those stats.)
[15:56:43] <GothAlice> StephenLynx: So you then introduce race conditions due to treating MongoDB like a relational database and faking joins client-side.
[15:56:45] <StephenLynx> I also pre-aggregate some stuff, like post count.
[15:56:55] <GothAlice> So, certainly a trade-off, and not nearly as "pure" as you might think.
[15:57:18] <StephenLynx> when I query for posts I just look for all posts that matches both the forum and the thread.
[15:57:50] <StephenLynx> I don't look for pieces of data in another collection and add it to my data.
[15:57:53] <GothAlice> "Give me thread X and all replies to thread X." — fake join requiring a findOne for the thread, and subsequent separate find for the replies.
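The "fake join" GothAlice is describing, as two round trips in the shell (collection and field names assumed):

    var thread = db.threads.findOne({ _id: threadId });
    var replies = db.posts.find({ thread: threadId }).sort({ created: 1 });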
[15:58:21] <StephenLynx> it is a simple where x = y
[15:59:01] <GothAlice> Yup, likely using IDs as a _fake join_. Here's an example of something: when you delete a thread, do you keep the replies around?
[15:59:19] <StephenLynx> no, I delete all posts that matches said forum and thread.
[16:00:03] <GothAlice> So there exists a time at which replies exist, but the linkage to the thread is broken because the thread has been deleted, but the replies haven't been yet. (Or the reverse, depending on what order you do things.)
[16:00:14] <GothAlice> http://www.javaworld.com/article/2088406/enterprise-java/how-to-screw-up-your-mongodb-schema-design.html may be a worthwhile read.
[16:02:13] <jenner> guys, is there a way to fix "Document Error: key for index {...} too long for document: {...}" without deleting the document as suggested in the upgrade faq?
[16:02:13] <GothAlice> Admittedly, I, too, do the "fake join" thing at the Forum -> Thread level. Threads are their own collection (nesting replies within them), but individual Forums are, indeed, their own thing. Double nesting has even greater restrictions (notably on $elemMatch), so I don't go that far.
[16:02:57] <StephenLynx> one thing that caused me to move threads to a different collection was pinned threads
[16:03:29] <GothAlice> https://github.com/bravecollective/forums/blob/develop/brave/forums/component/thread/model.py#L47-L50 < I support several flags.
[16:03:38] <StephenLynx> that required sorting by a condition and sorting by time
[16:03:58] <StephenLynx> and how do you sort pinned threads?
[16:04:20] <GothAlice> I do two queries. Step 1: find all sticky threads in the current forum. Step 2: find everything else. That's the "pragmatic" approach to the problem, rather than creating a convoluted query to do it all in one.
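GothAlice's two-step listing, sketched with assumed field names: the sticky query runs first, then everything else, each with its own ordinary sort.

    var sticky = db.threads.find({ forum: forumId, sticky: true }).sort({ modified: -1 });
    var rest = db.threads.find({ forum: forumId, sticky: { $ne: true } }).sort({ modified: -1 });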
[16:04:25] <StephenLynx> oh, you also have a separate collection for threads, right?
[16:04:57] <StephenLynx> I didn't say I didn't support it, I said it caused me to move to a separate collection
[16:05:05] <GothAlice> Threads is the primary collection of the entire app. 99% of the data exists there.
[16:05:39] <GothAlice> Maybe two or three dozen forums organized into a simple hierarchy, and some user mapping stuff. (These forums use an external authentication/authorization provider.)
[16:06:19] <StephenLynx> yeah, your threads are pretty much like mine.
[16:06:36] <StephenLynx> except I keep posts separate because I didn't want to deal with the 16mb limits on threads :v
[16:07:02] <GothAlice> Except that until your users write 3.3 million words within a thread, you won't have to deal with the 16mb limit on documents.
[16:07:51] <Nilium> I hate dealing with the 16mb limit because I wouldn't be hitting it if people before me hadn't made awful decisions on DB structure ಠ_ಠ
[16:08:04] <GothAlice> And creating a new thread, cloning the author and first post (which is itself a reply to the overall thread), adding a reply with a link to the original thread pointing at the new one then locking the old thread isn't exactly difficult for if you ever do need to deal with continuations.
[16:08:23] <Nilium> Nothing says fun like ~12,000 collections per DB.
[16:08:40] <GothAlice> Nilium: Indeed. MongoDB really requires you to sit down and think about your data model.
[16:08:59] <StephenLynx> by default I keep stuff on the same documents.
[16:09:02] <GothAlice> Nilium: Run into the namespace limits yet? :D
[16:09:17] <Nilium> Haven't _noticed_ that yet but I wouldn't be surprised if it's just that I'm not noticing it.
[16:09:47] <GothAlice> Well, index creation would explode and eventually you wouldn't be able to create new collections.
[16:09:58] <Nilium> Hm, probably haven't seen that yet.
[16:10:23] <StephenLynx> so your system creates collections dynamically?
[16:10:26] <GothAlice> http://docs.mongodb.org/manual/reference/limits/#namespaces < ah, you should be good until you get to ~24,000 namespaces (collections + indexes).
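On the MMAPv1 servers of this era the namespace list is itself queryable per database, so checking how close a database is to that limit is a one-liner:

    // counts collections plus indexes in the current database
    db.system.namespaces.count()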
[16:17:23] <Nilium> StephenLynx: Yep. Not my idea.
[16:17:48] <StephenLynx> obviously. I don't expect someone who does that kind of thing to be able to use IRC
[16:18:26] <Nilium> I keep trying to get time to just burn it all to the ground and rebuild it in some manner that's actually optimal, but it's tricky
[16:18:49] <Nilium> Also hard because there are about 48 databases like this and it just makes me want to cry.
[16:19:08] <GothAlice> Clearly you need to mandate some courseware on your co-workers.
[16:19:20] <GothAlice> At my place of work we have silly hats.
[16:19:39] <Nilium> And I'm not even that familiar with MongoDB, I just bothered to read a tiny bit and then looked at our DBs and couldn't figure out how it got to be this way
[16:20:32] <Nilium> The closest I get to silly hats is wearing my Tool hat, 'cause having "Tool" across my forehead all day works wonders.
[16:20:42] <GothAlice> If you're about to do something in production (deploy, mangle data, etc.) you must wear the outrageous hat of silliness. We also have a data design bowler hat. These tend to draw attention of "backup" developers to watch over the shoulder and, in theory, prevent issues.
[16:21:43] <Nilium> That would be somewhat useful here.
[16:22:19] <GothAlice> We also have a continuous integration server running that projects captioned pony GIFs on the wall depending on the outcome of a test run.
[16:22:36] <GothAlice> (Also shaming the developer who introduced the fail.)
[16:22:44] <Nilium> Right now we just require every new schema change (in RDBMS land) to go through live ops and our DB admins, and they at least catch a bunch of things there.
[16:23:02] <Nilium> Zero oversight for the Mongo side of things 'cause only a couple people know what's going on there.
[16:28:00] <GothAlice> brew install mongodb; then I use https://gist.github.com/amcgregor/c33da0d76350f7018875
[16:28:03] <Nilium> It's my compromise to get both a usable shell and software I want.
[16:28:39] <GothAlice> I can't afford to pay myself to be my own sysadmin on Linux. I tried, but I wasted far too much of my time just trying to keep Linux stable as a desktop. (*shakes a fist at nVidia*)
[16:29:20] <GothAlice> dsirijus: That script constructs a local 3x3 sharded replica set with authentication enabled to allow my projects to test sharding keys in development.
[16:31:24] <GothAlice> dsirijus: As for per-project isolation, I generally develop with a single mongod process running via homebrew's normal mongodb launchd integration. They're generally only isolated by database on the same "host".
[16:31:30] <GothAlice> That script is mostly for the automated test suites to run.
[16:32:36] <dsirijus> eh, i should really run many VMs, but that's, even though pure, a PITA
[16:33:49] <Nilium> Vagrant's nice, SaltStack is nice aside from the documentation and setting it up.
[16:33:56] <GothAlice> So one assumes you have a heavily caching transparent proxy in place to ensure things aren't downloaded twice. ;)
[16:34:03] <Nilium> Once it's set up and I never have to look at it again, though, it's nice.
[16:34:45] <Nilium> I give SaltStack credit for coming up with an environment that makes PHP appear desirable though.
[16:35:24] <dsirijus> GothAlice: hm. good point. i should set up VPS while i'm at that too. but no time, eh.
[16:37:25] <GothAlice> Some days I feel coding is like: http://cl.ly/image/1f2f1Y1m0s3D1X0g0G2j Most of the time not so much, though.
[16:39:47] <dsirijus> heh, i'm actually lovin' it these days. but no time to do it all! especially by-the-book
[16:42:20] <dsirijus> mongo instance that's serving an infrequent logging purpose (but with a fair amount of data per document) shouldn't really require much RAM, right?
[16:42:26] <dsirijus> basically, just writing to it
[16:42:40] <dsirijus> (and it's not even _that_ much data per document)
[16:44:37] <GothAlice> dsirijus: MongoDB uses memory-mapped on-disk stripes. This means it'll always have lots of RAM "wired" to its memory space, but the kernel will decide how much to actually load. There will be a tendency for MongoDB to absorb as much free RAM as possible still, though, based on data set size.
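A quick way to see the distinction GothAlice draws between mapped and resident memory:

    db.serverStatus().mem // { resident: ..., virtual: ..., mapped: ... } in MB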
[16:45:48] <GothAlice> My personal at-home dataset far exceeds available RAM—26 TiB—and is almost write-only. Queries are _incredibly_ slow the odd time I need to use them, though. (Not even my indexes fit in RAM, usually a desperate sign one needs to throw more hardware at the problem.)
[16:47:56] <dsirijus> yeah, none of these queries will be app-facing, only for on-demand manual queries
[16:48:40] <dsirijus> and i sure hope my indexes will exceed RAM - it's invoice accounting :D
[16:49:53] <GothAlice> dsirijus: Y'know, I'd be quite curious as to the data model you're using to store invoice data. We're also doing this at work, and I like to compare notes on design. :)
[16:51:37] <dsirijus> GothAlice: well, one collection is actual invoices, but it's really mock-invoices since i'm not actually selling to users, platforms are (fb, apple, google, ...)
[16:52:38] <dsirijus> the other is records of purchases of virtual goods with virtual currency
[16:53:09] <dsirijus> GothAlice: so, i'm happy to not need to go through what you will :)
[16:54:21] <dsirijus> though, dunno, i guess it would be nice to finally grok that aspect of paperwork by modelling it
[17:10:46] <GothAlice> ¬_¬ Yeah. The only loop structure defined in the core language is "forever".
[17:10:46] <StephenLynx> college quality project :v
[17:11:02] <StephenLynx> mine is some sort of storage tool
[17:11:24] <StephenLynx> when I come to think about it, it is sort of like Access
[17:11:46] <StephenLynx> the cherry on the cake is how I store stuff as XML
[17:12:00] <StephenLynx> I didn't serialize stuff myself, I just used some weird library that did that
[17:12:17] <StephenLynx> because my objects were incredibly complex, it would have been really slow
[17:13:33] <GothAlice> https://gist.github.com/amcgregor/016098f96a687a6738a8 < somewhat more complete documentation. Mine is written on top of RPython and uses Pypy's JIT compiler, so it's pretty darned fast. OTOH, I'm working on making it do insane things, like full dependency graph generation for your code and automatic statement re-ordering for optimized parallelization.
[17:14:00] <GothAlice> (You don't do threads in Clueless, threads do you.)
[18:41:47] <theRoUS> Derick (or anyone, for that matter): i'm having a real problem with my document's "id" field (over the name of which i have no control) being conflated with and coerced to "_id".
[18:42:16] <theRoUS> i'm at a couple of removes (mongoid in rails) so i'm not sure where it's happening.
[18:42:41] <theRoUS> but the upshot is that "_id" is getting *my* value, and "id" is getting nil.
[18:42:56] <theRoUS> i haven't found anything online (yet) on how to avoid this
[19:07:48] <Sam___> theRoUS: what are you using for an ORM or mapping from your code to mongo documents?
[19:11:36] <Sam___> but there may be something in mongoid that lets you set the primary key. This will prevent the _id from getting the value. I had a similar problem with mongoengine
[19:15:05] <theRoUS> Sam___: i'll check for that. thanks!
[21:15:12] <gabeio> hey, so how does one (on linux) run mongodb under a different user than the mongodb user? I just don't know the setting name in the config file
[21:33:39] <dacuca> guys I have mongodb 2.2 with replica set and I want to upgrade to 3.0 (when it is out). What’s the best way to do it?
[21:35:30] <dacuca> I was thinking about adding new nodes to the replica with mongodb 2.4, then shutdown the old ones, and so on. is that ok?
[21:39:20] <gabeio> this is probably a better question than the previous one do any of you know of a better way to allow mongodb to write to an external usb on ubuntu without giving it root?
[21:40:33] <scellow> joannac, thanks ill take time to read that !
[21:40:53] <joannac> dacuca: sure, but that'll be slow
[21:41:11] <dacuca> gabeio: you can add the user to the group that has permission to write on the device
[21:41:22] <dacuca> joannac: but it is safer isn’t it?
[21:41:42] <joannac> dacuca: why not upgrade in place?
[21:42:17] <scellow> joannac, i have already read that guide, i have no problem connecting to my local test db, the problem is i can't connect to the remote server, the ports are open, mongod is correctly bound to localhost, the port is correct, but unable to connect
[21:42:20] <dacuca> I’m just afraid of data corruption or something
[21:42:55] <joannac> scellow: what? mongod is correctly bound to localhost? how is that "correct"
[21:43:23] <gabeio> there is no way to change the permissions on the usb... it is set to root:root >.> I tried sudo chown root:othergroup and it just freaks
[21:43:50] <dacuca> gabeio: try to mount the external usb as the user who runs mongodb
[21:44:09] <joannac> dacuca: okay, so take backup before you upgrade, like you should do anyway
[21:45:37] <scellow> joannac, i was talking about the mongodb.conf file. from what i have understood, mongodb is correctly listening on 127.0.0.1 and can accept connections on port 27017, so i should be able to access the db? am i wrong?
[21:45:40] <dacuca> if the plan is switching to wiredtiger probably the best plan is to backup -> jump from 2.2 to 3.0 -> restore backup, isn’t it? joannac
[21:46:20] <dacuca> since I read somewhere that downtime is not an option when switching to wiredtiger
[21:47:12] <gabeio> dacuca: you are making me feel like a noob at this, but I suck at file systems. how would I go about mounting it under another user and keeping it mounted there automatically on restarts? if it's a command I know I can add it to the rc.local startup script
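A hedged sketch of dacuca's suggestion, with an assumed device name: mount the drive, then hand the mountpoint to the mongodb user. This works for ext3/ext4; FAT/NTFS filesystems carry no Unix ownership and need uid=/gid= mount options instead. Both lines could go in rc.local to survive restarts.

    sudo mount /dev/sdb1 /mnt/usb
    sudo chown -R mongodb:mongodb /mnt/usb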
[21:49:26] <joannac> just do your upgrade plan if it makes you feel better
[21:49:54] <joannac> as long as your databases aren't massive it should only be a bit slower
[21:51:07] <scellow> joannac, but it doesn't answer my problem: why can't i connect to the server, if it listens on 127.0.0.1 and port 27017 is open?
[21:51:25] <joannac> scellow: because you're not connecting on localhost, you're connecting remotely?
[21:51:29] <scellow> what would be wrong, (beside me :p)
[21:52:45] <dacuca> joannac: is there a plan to upgrade to wiredtiger (from mmap) without downtime then?
[21:53:38] <scellow> joannac, mongodb is on my server, i'm trying to connect to that server from my computer, mongoClient = new Mongo("myHost" , 27017); and i get a connection refused, i checked the port and it's open
[21:54:41] <joannac> scellow: do you understand the difference betwee "local" and "remote"?
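The likely root of scellow's problem: a mongod bound only to 127.0.0.1 refuses every remote connection, however open the firewall port is. A sketch of the pre-3.0 mongodb.conf fix (enable auth before exposing the server like this):

    # listen on all interfaces instead of loopback only
    bind_ip = 0.0.0.0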
[22:25:08] <jpfarias> so I’ve got a database that is quite large, it has 200GB of data in 2 collections and about 60 million objects (15 million in one collection and 45 million in the other), this database is running off of a single machine with 24GB of memory. My question is, would queries run faster if it was running sharded over X machines with 24/X GB of memory? like, 2 machines with 12GB of memory each? or 10 machines with 2.4 GB of memory each?
[22:40:05] <Sam___> jpfarias: ....yes if you build the shard correctly and have the right shard key
[22:42:39] <Sam___> jpfarias: there are many advantages to sharding.....distributed queries...means faster response time....also means you can scale
[22:42:54] <Sam___> bybb: what is the command you issued?
[22:42:58] <bybb> I get a "not running with --replSet" message
[22:43:19] <bybb> Sam___ I'm following this http://docs.mongodb.org/manual/reference/method/rs.reconfig/#rs.reconfig
[22:43:36] <bybb> using rs.conf() and rs.reconfig()
[22:43:39] <Sam___> bybb: do you have a replica set? sounds like you are in single mode
[22:43:52] <Sam___> rs. is only reserved for replica sets
[22:44:27] <Sam___> you need to start 3 instances (2 data nodes and an arbiter) then rs.initiate()
[22:44:34] <Sam___> then rs.add() to add the other 2
[22:44:39] <Sam___> then you will be in a replica set
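Sam___'s recipe in shell form, host names assumed: each mongod must first be started with the same --replSet name, then from one of them run:

    rs.initiate()
    rs.add('host2:27017')
    rs.addArb('host3:27017') // arbiter votes in elections but holds no data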
[22:44:58] <bybb> well I'm quite lost, running with a replica set returns this "set name may not change"
[22:45:17] <Sam___> jpfarias: we can discuss the pros and cons of sharding if you would like....but if you want to scale you data set and make your queries faster then you will need to shard with mongo
[22:45:42] <Sam___> bybb: how did you start your mongod?
[22:45:48] <Sam___> command line or with init file?
[22:46:02] <jpfarias> Sam___: I see, I just didn’t find much information about when to shard or not
[22:46:17] <bybb> Sam__ for the replica set I use a conf file
[22:46:32] <jpfarias> or if it is better to have more smaller machines or fewer bigger machines
[22:46:41] <Sam___> jpfarias: it is different for everyone to be honest....resources, velocity of data, etc.