[00:11:19] <greyTEO> GothAlice: any chance development will continue on https://github.com/marrow/task ?
[00:12:35] <GothAlice> greyTEO: Indeed; I'm currently polishing up the cinje template/DSL engine first (ugh, marketing sites are tedious to work on if you're a lib developer ;) then can turn my attention to divorcing task from MongoEngine prior to bundling up a release of that.
[00:12:55] <GothAlice> MongoEngine has… broken my heart a bit. :/
[00:14:03] <greyTEO> awesome! I'm currently re-implementing a queue from a while back and don't see marrow.task available on pip anymore..
[00:14:46] <greyTEO> not sure what version I was using way back in the day
[00:15:04] <GothAlice> m.task has never been available on the package index. :/
[00:15:29] <GothAlice> greyTEO: https://github.com/marrow/mongo/blob/develop/marrow/mongo/util/capped.py may be of interest.
[00:15:58] <GothAlice> marrow.mongo is going to be my collection of pymongo add-ons, vs. the MongoEngine approach of wrapping everything.
[00:16:48] <GothAlice> And, ah, see the documentation for Python's "warnings" module to disable that "A batgirl has died." warning if you try to patch in production environments. ;P
[00:19:02] <greyTEO> ok, I thought it was. maybe it’s a documentation thing
[00:19:12] <greyTEO> I'll give that repo a look and see what’s in there.
[00:19:28] <greyTEO> it is an interesting warning though…lol
[00:21:00] <GothAlice> "Monkeypatching" is generally frowned upon.
[00:21:34] <GothAlice> But the function was designed, in this iteration, to match the call semantics of the rest of the new pymongo API, i.e. designed to integrate as a method on Collection.
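The exact warning that marrow.mongo raises when you patch the helper in isn't shown in the log, so this is only a generic sketch of the warnings-module approach GothAlice points at; the message pattern and category below are assumptions, not taken from the library.

```python
import warnings

# Hypothetical filter: silence the specific warning emitted when patching the
# helper onto pymongo's Collection. The pattern and category are assumed here;
# check the actual warning marrow.mongo emits before copying this.
warnings.filterwarnings(
    "ignore",
    message=r"A batgirl has died",   # regex matched against the warning text
    category=UserWarning,            # or whichever category the library uses
)
```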
[01:46:25] <Forbidd3n> Quick question regarding storing a dataset in MongoDB. I would have three pieces: Cruise Lines, Cruise Ships (linked to lines), Cruise Schedules (linked to ships) - should I store this all in one collection or should I create three collections and link them by id?
[01:47:26] <Forbidd3n> I was thinking of one collection called Schedules (Line, Ship, Date, Port, Arrive, Depart)
[01:48:03] <Forbidd3n> The only issue with this is I will need to use Cruise Lines and Cruise Ships again and will have other tables using the same references
[01:51:53] <Forbidd3n> Or better yet, I was thinking one collection called Schedules that holds the parent (lines -> object/array of ships per line -> object/array of schedules per ship)
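For comparison, a minimal sketch of the two shapes Forbidd3n is weighing, written as plain Python dicts; all field names and values are hypothetical.

```python
# Option 1 - flat: one document per schedule entry, with line/ship repeated
# (or replaced by ids referencing separate Lines/Ships collections).
flat_schedule = {
    "line": "Example Line",      # hypothetical values throughout
    "ship": "Example Ship",
    "date": "2016-05-01",
    "port": "Miami",
    "arrive": "08:00",
    "depart": "17:00",
}

# Option 2 - nested: one document per cruise line, embedding its ships,
# each ship embedding its schedules.
nested_line = {
    "line": "Example Line",
    "ships": [
        {
            "name": "Example Ship",
            "schedules": [
                {"date": "2016-05-01", "port": "Miami",
                 "arrive": "08:00", "depart": "17:00"},
            ],
        },
    ],
}
```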
[08:19:40] <bo_> Hi All, in a replica set the primary and arbiter nodes can't see the secondary node. They return "lastHeartbeatMessage" : "Couldn't get a connection within the time limit". Please help
[08:47:19] <bo_> in a replica set the primary and arbiter nodes can't see the secondary node. They return "lastHeartbeatMessage" : "Couldn't get a connection within the time limit". Please help
[10:15:59] <r3d0x> hey there. I'm having an issue here with an update, concerning the positional operator. Does anyone here have an idea if this is a bug? Thanks http://pastebin.com/h2NGhN8Z
[10:18:57] <Derick> r3d0x: not sure whether it is a *bug* - perhaps not something we support. Does it work if you switch the query clauses for scopes and tags.id around?
[10:19:48] <r3d0x> Derick: i've already tried that and it didn't work
[11:48:59] <kurushiyama> aiRness Use LVM, create snapshots. Or use your cloud manager's backups
[11:49:33] <aiRness> hmm that's not an option atm unfortunately
[11:49:50] <aiRness> but I thought I could reach the data with mongodump from outside the container
[11:49:55] <aiRness> from inside the container the dump works ok
[11:50:51] <bo_> <kurushiyama> but what is it? Why does mongod behave weirdly when sending heartbeats?
[11:51:29] <kurushiyama> aiRness a) mongodump is the very least backup solution I use, aside from config servers in a sharded cluster b) The bigger your data gets, the more inconsistent your backup will become using mongodump. c) at least LVM should be possible. It is free and easy to use.
[11:51:59] <kurushiyama> bo_ It sends a heartbeat every few secs, iirc. That is not the cause of any problem.
[11:52:12] <aiRness> I understand, for now it should be fine though
[11:54:13] <bo_> I agree with you that the heartbeat is not the cause of the problem, it's a result of the problem
[11:54:22] <kurushiyama> aiRness Then simply connect to your mongodb via the exposed port and do your dump. Again, doing a mongodump against a running instance is a horrible idea for backup purposes, if we are not talking of cheap data. It _is_ possible, but you really have to put some effort into data modelling to make sure you will get a consistent backup.
[11:54:44] <kurushiyama> bo_ Maybe you should start from the beginning, describe your env and the problems you experience.
[11:55:04] <aiRness> kurushiyama: alright, thanks but regarding connecting to the exposed port I get an empty dump, I can see the data only from inside the container
[11:55:10] <aiRness> when I do a mongodump that is
[12:06:34] <kurushiyama> run docker ps -a on the other nodes of the replset. I assume they expose the port.
[12:07:00] <aiRness> I run it on the primary, ok give me a min
[12:07:29] <kurushiyama> aiRness If it is not reachable, it is unlikely to be still primary, no matter your designation ;)
[12:08:30] <aiRness> the only issue at the moment is that mongodump is empty, it can see the host/port fine and I can connect to the mongo client
[12:08:40] <aiRness> all the instances have the same setup
[12:10:04] <kurushiyama> aiRness Well, since you seem to have it all right, I can not help you. Your magical docker env, which does not expose a port probably exchanges the data using quantum fluctuation, which is beyond my understanding.
[12:10:46] <aiRness> well I didn't claim that I have it all right, it's weird to me
[12:39:36] <kurushiyama> aiRness imvho, DevOps is just an expression of CXOs saying "I just have to c&p a few lines, and I have software X installed. How hard can that be?" Overlooking some tiny details.
[12:40:49] <dddh__> hidden replica set member with no indices > mongodump
[12:42:36] <kurushiyama> dddh__ The problem is when data gets big and the mongodump takes long. Hidden or not, you'd run into problems.
[12:43:37] <aiRness> kurushiyama: it will fade away like other stuff did, virtualization, agile, cloud way of doing things, etc.
[12:44:21] <aiRness> and it created even more confusion
[12:44:34] <aiRness> nobody understands what it actually means, everyone is misinterpreting each other
[12:44:52] <kurushiyama> aiRness I doubt that. It "saves" money. Those things are going to persist. And I am not too sure agile fades. It is the reason for this devops nonsense in the first place, no?
[12:49:17] <kurushiyama> Spritzgebaeck Good day to you too! ;)
[12:50:41] <Spritzgebaeck> i have a question. we had an application on mongodb 2.6.11, today we upgraded to 3.2.1 and our aggregates with an $out command are slower than before. if we remove the $out command they perform nearly the same
[12:56:47] <Spritzgebaeck> okay, i had a disconnect, so i have to ask the same question again
[12:56:49] <Spritzgebaeck> i have a question. we had an application on mongodb 2.6.11, today we upgraded to 3.2.1 and our aggregates with an $out command are slower than before. if we remove the $out command they perform nearly the same
[12:58:41] <kurushiyama> Spritzgebaeck Well, sort of. Do you have a replset?
[12:59:10] <scruz> can’t do a $group within a $group, more’s the pity
[13:00:23] <scruz> i think i’ll $push 1 or 0 to an array then sum outside mongodb
[13:00:51] <kurushiyama> scruz How about pastebinning an input doc and the expected output?
[13:00:56] <Spritzgebaeck> kurushiyama: no, it is only one mongodb on a windows 8 system
[13:01:46] <kurushiyama> Spritzgebaeck Uhm. You are running a mongod on a windows 8 for production and performance is of concern?
[13:02:52] <kurushiyama> Spritzgebaeck Forgive me, but that is a bit like complaining that your VW Golf is getting slower when you load stones into it, all while trying to get it ready to compete in Formula 1.
[13:03:03] <Spritzgebaeck> it's a developer machine and after the upgrade our aggregates took longer.
[13:03:52] <kurushiyama> Spritzgebaeck Which is understandable. a) WT needs more RAM b) WT needs more CPU c) WT is more reliant on FS performance.
[13:05:38] <kurushiyama> Spritzgebaeck regarding a) Windows takes a lot of RAM by itself. b) Your CPU most likely did not change c) Both ReFS and NTFS aren't exactly known as performance monsters.
[13:08:18] <StephenLynx> i'd suggest using a VM to develop.
[13:08:31] <StephenLynx> running anything worthwhile on windows is masochism.
[13:08:46] <Spritzgebaeck> kurushiyama: okay, then we have an explanation for our problem. we were uncertain about it
[13:14:09] <kurushiyama> Spritzgebaeck Be warned. Running MongoDB on Windows forces you to scale prematurely, most likely because of IO or RAM bottlenecks artificially made worse by the underlying OS. We can easily be talking hundreds to thousands per month.
[13:16:06] <kurushiyama> Wow, running an IT infrastructure with no admins. That is the epitome of devops.
[13:17:46] <StephenLynx> told you, his clients are beyond hope, probably.
[13:17:58] <kurushiyama> What's next? Getting rid of dev? Or sth like SalesMarkBackOfficeDev?
[13:18:00] <StephenLynx> the worst is that he is aware of it.
[13:24:41] <scruz> kurushiyama: i’m building the pipeline in python, so i get some extra tools to work with to create a monster of an aggregation pipeline
[13:31:41] <scruz> that’s from an interactive session for thinking through
[13:31:50] <scruz> now have to translate that into solid code.
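A minimal sketch of the workaround scruz describes (since a $group can't be nested inside another $group): build the pipeline as plain Python data, $push a 1/0 flag per document, and do the second level of summing client-side. Collection and field names are hypothetical.

```python
from pymongo import MongoClient

coll = MongoClient()["testdb"]["events"]   # hypothetical database/collection

# Single $group stage: push a 1/0 flag per document into an array,
# since a second, nested $group isn't possible.
pipeline = [
    {"$group": {
        "_id": "$category",                 # hypothetical grouping key
        "flags": {"$push": {"$cond": [{"$gt": ["$value", 0]}, 1, 0]}},
    }},
]

# The "inner" summing happens client-side, outside MongoDB.
for group in coll.aggregate(pipeline):
    print(group["_id"], sum(group["flags"]))
```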
[15:21:34] <deshymers> I'm looking at using mongodb to store analytic data, and I was wondering if this https://docs.mongodb.org/ecosystem/use-cases/pre-aggregated-reports/ is still relevant and current?
[15:22:01] <deshymers> I found a few blog posts but they were almost 6 years old
[15:22:29] <echelon> hi, what should i do if mongo runs out of memory?
[15:47:12] <echelon> "Nevertheless, systems running MongoDB do not need swap for routine operation. Database files are memory-mapped and should constitute most of your MongoDB memory use. Therefore, it is unlikely that mongod will ever use any swap space in normal operation. The operating system will release memory from the memory mapped files without needing swap and MongoDB can write data to the data files without needing the
[15:58:50] <kurushiyama> echelon In which case they'd have to be loaded from "disk"
[15:59:44] <deshymers> I am working on storing analytic data in mongodb, we don't need to be very granular, only weekly and monthly counts, so I was thinking that this structure would be good for that, https://gist.github.com/deshymers/d86d1d579c32a133e47d2ec227b39315
[16:02:34] <deshymers> that makes sense, the documents would get too bloated then
[16:03:05] <kurushiyama> deshymers As for time series data, I'd always put it into a flat "schema" like {"date": someDate, "source":whatever, "value": foo}
[16:04:23] <deshymers> kurushiyama: so one document per entry then? and another for total counts?
[16:05:27] <kurushiyama> deshymers You do not need another one. You can calc the counts on demand or from time to time using aggregations, potentially writing those aggregations to a collection using an $out stage
[16:05:38] <deshymers> this is what I was following, https://docs.mongodb.org/ecosystem/use-cases/pre-aggregated-reports/#schema
[16:09:52] <kurushiyama> deshymers If you need real time data (in the strictest meaning of the word) however, it might not be optimal.
[16:10:21] <deshymers> oh no this isn't for realtime, we only want to graph weekly and monthly views
[16:10:44] <kurushiyama> deshymers Then I'd go with flat events, plus aggregations with an $out stage
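A rough pymongo sketch of the flat-events-plus-$out approach kurushiyama suggests; the database, collection, and field names are hypothetical, and the rollup uses the $year/$week date operators available in that era's MongoDB.

```python
from datetime import datetime, timezone
from pymongo import MongoClient

db = MongoClient()["analytics"]   # hypothetical database name

# One small, flat document per event.
db.events.insert_one({
    "date": datetime.now(timezone.utc),
    "source": "homepage",          # hypothetical source label
    "value": 1,
})

# From time to time, roll the events up into weekly counts and
# materialize the result into its own collection with $out.
db.events.aggregate([
    {"$group": {
        "_id": {
            "year": {"$year": "$date"},
            "week": {"$week": "$date"},
            "source": "$source",
        },
        "count": {"$sum": "$value"},
    }},
    {"$out": "weekly_counts"},      # overwrites the rollup collection each run
])
```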
[16:10:52] <deshymers> I know this talks about down to the second, http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb and that is way overkill for my needs :P
[16:11:40] <deshymers> kurushiyama: and filtering a date index by week won't be a big performance hit?
[16:12:38] <deshymers> I guess I could still store the data as an int if I need to
[16:12:46] <kurushiyama> deshymers depends on what we are talking about. But in general we are talking of a DBMS. It would not really stand up to the name if it was unable to answer questions.
[16:12:47] <deshymers> err store the week as an int
[16:14:14] <kurushiyama> deshymers Let me put it that way: I do something similar regularly for testing purposes on 10M records. On my 6 year old MBA...
[16:14:43] <cheeser> putting that business degree to work!
[16:16:47] <kurushiyama> deshymers You are welcome.
[16:22:14] <deshymers> so another question, the ObjectID contains a timestamp, is that indexable? or would I be better off having a separate timestamp to query against?
[16:25:36] <kurushiyama> deshymers short answer: Use a separate timestamp. Long answer: There are some rare use cases in which ObjectId is sufficient since it is monotonically increasing. But it may get really complicated with client-side generated ObjectIds and the like. All that for a saved index over a 64bit int? I don't think so.
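For reference, a short sketch of both options with bson/pymongo; the timestamp embedded in _id is real (second resolution), but the collection and field names here are hypothetical.

```python
from datetime import datetime, timezone
from bson import ObjectId
from pymongo import MongoClient, ASCENDING

coll = MongoClient()["analytics"]["events"]   # hypothetical names

# Option 1: lean on the timestamp embedded in _id (already indexed).
start = ObjectId.from_datetime(datetime(2016, 1, 1, tzinfo=timezone.utc))
recent = coll.find({"_id": {"$gte": start}})

doc = coll.find_one()
if doc:
    print(doc["_id"].generation_time)          # second-resolution UTC datetime

# Option 2 (usually simpler, per the advice above): a dedicated indexed field.
coll.create_index([("date", ASCENDING)])
recent = coll.find({"date": {"$gte": datetime(2016, 1, 1, tzinfo=timezone.utc)}})
```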
[16:34:27] <kurushiyama> Derick Well, since my c++ skills are around 0.1 on a scale of 1 - 100, I'd probably accidentally transcend MongoDB into the Flying Spaghetti Monster. ;)
[18:21:24] <diegoaguilar> what's better to use in order to get counts
[18:24:54] <kurushiyama> StephenLynx there is a bug somewhere. Ran into it, and while researching on it, I found out that aggregations are not susceptible to this problem.
[18:29:03] <kurushiyama> diegoaguilar So personally, I would use an aggregation
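The context of diegoaguilar's question isn't in the log, but a minimal pymongo sketch of getting counts via an aggregation, as kurushiyama suggests, could look like this; field and collection names are hypothetical.

```python
from pymongo import MongoClient

coll = MongoClient()["testdb"]["items"]        # hypothetical names

# Server-side counts per value of a field, in a single aggregation.
pipeline = [
    {"$match": {"active": True}},              # optional filter, hypothetical field
    {"$group": {"_id": "$status", "count": {"$sum": 1}}},
]
for row in coll.aggregate(pipeline):
    print(row["_id"], row["count"])
```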
[18:30:56] <uehtesham90_> hello, i had a question regarding using flask with mongoengine. i am adding a flask web app to an already existing project which consists of multiple databases. but in flask-mongoengine, we have to define a default database, so should we just provide one of the database names as the default database and internally access other databases through the mongo
[18:30:56] <uehtesham90_> client? i am not sure what the best approach is...would appreciate some help here :)
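If multiple databases really are unavoidable, the fallback uehtesham90_ describes (one default database for the ODM, the rest reached directly through the driver) might look roughly like this in plain pymongo; the connection string and database/collection names are hypothetical.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")   # hypothetical URI

# flask-mongoengine would be pointed at one default database (say "course_main");
# the other per-course databases can still be read directly through the driver.
other_db = client["course_history_101"]              # hypothetical database name
for doc in other_db["events"].find().limit(5):       # hypothetical collection
    print(doc)
```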
[18:32:03] <kurushiyama> uehtesham90_ Maybe you should ask on #flask and/or #mongoengine?
[18:33:22] <uehtesham90_> i am asking in #mongoengine but there was no response...so i thought i would give it a shot here
[18:34:16] <kurushiyama> uehtesham90_ Well, personally, I do not use either, so my help will be limited.
[18:34:18] <StephenLynx> have a system setting to indicate the database name
[18:36:22] <uehtesham90_> StephenLynx: there is a system setting but we can only define one database...but i currently have many databases...so i wasn't sure if i should have a separate database instance referring to each database?
[18:37:02] <StephenLynx> and you really need multiple databases?
[18:38:31] <kurushiyama> uehtesham90_ From what I can see, the problems might actually lie elsewhere.
[18:39:54] <uehtesham90_> StephenLynx: this is how my project is set up. it is for analysing student data for different online courses. so we store each course's data in a separate database (each with its own collections)
[18:45:53] <StephenLynx> addendum: dynamic collection creation and dynamic field creation are awful too.
[18:46:04] <uehtesham90_> just one more question....based on your experience, when should a separate database be used? like what should one look for when considering using a new database
[18:46:05] <StephenLynx> if you can't document your model, you should change your model.
[18:47:30] <StephenLynx> you have to have this legacy database being used by your software and at the same time have a new database
[18:47:51] <StephenLynx> or you want to isolate the databases physically and it serves a different purpose
[18:48:22] <StephenLynx> usually a different database will serve a different system and you won't connect to the db directly and will instead communicate with the system that handles the second database.
[18:48:37] <StephenLynx> so tl,dr; only add a second database as a last resort.
[18:48:54] <StephenLynx> if there is absolutely no other option.
[18:50:20] <StephenLynx> that usually a different db is behind a different system.
[18:50:49] <StephenLynx> and that usually you shouldn't.
[18:50:51] <kurushiyama> StephenLynx I was not weakening your arguments. I was just differing on the conclusion ;)
[18:52:15] <uehtesham90_> thank you so much StephenLynx and kurushiyama
[18:52:27] <kurushiyama> Imho, there is absolutely no reason to have two databases even when dealing with legacy systems. either migrate or integrate. ;)
[18:53:06] <StephenLynx> you only do that under extremely contrived circumstances.
[18:53:38] <StephenLynx> and any other option means everything that depends on the software gets trashed, including companies.
[18:54:43] <dman777_alter> with mongo node driver insertOne(doc, options, callback){Promise}
[18:54:46] <kurushiyama> StephenLynx Whereas it remains to be proven that keeping multiple databases makes more sense than doing a migration. No offense?
[18:57:11] <StephenLynx> when I said last resort, I really meant last resort.
[18:57:35] <kurushiyama> uehtesham90_ Well... no, not really. Assume you have a legacy system in need of the old database, and you have a new system with some other. Integration would mean that you make those applications communicate with each other to exchange data, rather than to have them communicate by sharing data
[18:58:50] <kurushiyama> uehtesham90_ what you described would be a data migration. ETL, basically.
[19:01:19] <kurushiyama> uehtesham90_ The problem with communication via data sharing is that two applications become dependent on the same database, and changes in your data structure become really, really hard to be executed without affecting the other system
[19:28:27] <Kaetemi> getting this exception on the target system when doing rs.add, any idea http://pastebin.com/BYFN9fpm ?
[19:33:36] <FuzzySockets> When you create an index with mongoose, what do the 1 or -1 represent... the sort order?
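For what it's worth, the 1 / -1 in an index definition are the index direction (1 = ascending, -1 = descending), not a query-time sort; mongoose follows the same convention as the underlying MongoDB index specification. A pymongo equivalent, with hypothetical field names:

```python
from pymongo import MongoClient, ASCENDING, DESCENDING   # ASCENDING == 1, DESCENDING == -1

coll = MongoClient()["testdb"]["posts"]                   # hypothetical names

# Compound index: ascending on author, descending on created_at.
coll.create_index([("author", ASCENDING), ("created_at", DESCENDING)])
```

For a single-field index the direction rarely matters, since MongoDB can walk the index either way; it mainly affects sort order on compound indexes.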
[23:20:40] <dimaj> hello all! have a weird question today... is there a way to access a mongodb through another mongodb? My setup is as follows: Target MongoDB (Network A) <--- Proxy MongoDB (Network B) <---- Application (Network B). I need to access Target MongoDB from my application