[02:11:24] <Troy> I have a list of guids stored on a document and I want to say: give me all records where a guid is part of that document's list of guids. Doable?
[02:23:06] <Troy> it's like the opposite of an $in
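What Troy describes is actually the ordinary use of `$in`: read the document's array, then pass it as the `$in` argument. A sketch with made-up collection and field names; the shell shape is in the comments, and the plain-JS part mimics the `$in` semantics so it runs standalone:

```javascript
// Shell shape (hypothetical names):
//   const doc = db.parents.findOne({_id: someId});
//   db.records.find({guid: {$in: doc.guids}});
// Plain-JS equivalent of the $in match:
const doc = { _id: 1, guids: ["a-1", "b-2"] };
const records = [
  { guid: "a-1", name: "first" },
  { guid: "c-3", name: "other" },
  { guid: "b-2", name: "second" },
];
const matches = records.filter(r => doc.guids.includes(r.guid));
console.log(matches.map(r => r.name)); // → [ 'first', 'second' ]
```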
[05:34:21] <Aboba> Anyone want to point out what i'm doing wrong with this?
[09:38:11] <Ange7> hey all, from https://jira.mongodb.org/browse/DOCS-4999 how is it possible to set the mongodb socket with 777 permissions by default? thank you :)
[10:02:00] <Derick> Ange7: you can use the --filePermissions argument to mongod, but it's also just a config option: https://docs.mongodb.org/manual/reference/program/mongod/#cmdoption--filePermissions
[10:03:10] <kurushiyama> Although, more often than not, it turns out to be a bad idea to have your application running on the same machine as the mongod.
[10:04:12] <Derick> Ange7: and see https://docs.mongodb.org/manual/reference/configuration-options/#net-options for the YAML config file style for it
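For reference, the YAML form Derick mentions might look like the following sketch; the unix socket options live under `net`, and the values here are illustrative:

```yaml
# Illustrative config fragment: unix socket permissions under net options
net:
  unixDomainSocket:
    enabled: true
    pathPrefix: /tmp
    filePermissions: 0777
```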
[10:08:20] <kurushiyama> Derick: Hm. I wonder if it is noticeable, especially given the connection pool maintained by a lot of drivers which eliminates the handshake overhead a great deal by preallocation. But ofc you are right, and for maxing out performance it surely makes sense.
[10:10:47] <Derick> kurushiyama: it's not just the handshake that TCP does slower
[10:13:13] <kurushiyama> Derick: Now you have me. So you basically say that in case you have a mongos on your app-server, unix sockets should be the preferred communication, which basically means that this setup would be best practice performance-wise?
[10:18:37] <Ange7> How is this possible: I have a collection with 608 documents, and db.myCollection.findOne({domain: 'www.google.fr'}) (in the mongo shell with mongo 3.2.5) took 1/2 minutes...? And I have an index on the domain field.
[10:21:45] <Ange7> But afterwards it's instantaneous, because I think the query result is in the cache? :x
[10:23:47] <Derick> how much free memory do you have?
[10:42:16] <kurushiyama> Ange7: Well, is the server under load?
[11:16:48] <stava> I have a Project model which contains one ProjectType and one ProjectStatus model. ProjectType contains an Array of ObjectIds for allowed ProjectStatuses. Can I somehow create a constraint in Mongoose or MongoDB to validate, when creating or updating a Project, that the provided ProjectStatus is allowed by ProjectType? Ideally it should not be possible to remove an allowed ProjectStatus
[11:16:49] <stava> from ProjectType if it is used by a Project as well.
[11:21:44] <kurushiyama> stava: Uhm, this sounds overnormalized.
[11:23:10] <stava> But if the business requirement is as such
[11:23:19] <stava> What do I do? pick an rdbms instead?
[11:24:12] <kurushiyama> stava: You do not have to.
[11:24:33] <kurushiyama> stava: But I doubt that there is a business requirement to overnormalize.
[11:25:19] <kurushiyama> stava: You are trying to put too much logic into the database here. You do not want to allow removing a used status?
[11:25:26] <stava> kurushiyama: Of course not. The requirement is that projects should be modelled after project types, where you configure facts about the projects that inherit the type
[11:26:01] <kurushiyama> stava: you do not want to allow removing statuses which are in use, right?
[11:26:18] <kurushiyama> stava: Say the status is "NEW"
[11:26:30] <stava> No, I think that should require "releasing" the status from use first
[11:27:44] <kurushiyama> stava: Either way: That is logic which is not put into MongoDB, but your application.
[11:29:00] <stava> I get that. When creating or updating a project, I must fetch the relevant project type and validate the provided status against it?
[11:29:27] <kurushiyama> stava: Well, as far as I can tell from your description, this sounds about correct.
[11:30:30] <stava> So a document structure (or "schema") like Project {type: ObjectId, status: ObjectId}, ProjectType {statuses: [ObjectId]} is fine?
[11:31:50] <stava> I mean it looks like a relational/normalized schema
[11:32:16] <kurushiyama> stava: I would not do it that way. But then, I would not use Mongoose. The problem is that Mongoose kind of suggests (depending on how you see it) that MongoDB is sort of a limited RDBMS or an object store. Both views are good ways to get you into real trouble when you try to scale.
[11:33:08] <stava> I'd really like to hear how you'd do it, if you have time and motivation to explain it to me :p What do you prefer instead of Mongoose? Using the mongodb package for node directly?
[11:34:20] <kurushiyama> stava: The thing is that with an RDBMS, you identify your entities, their properties and relations, derive the models from that, and then bang your head against the wall to get your JOINs right.
[11:34:40] <kurushiyama> stava: Which is kind of what Mongoose forces you to do.
[11:34:58] <kurushiyama> stava: Or at least makes it very easy.
[11:35:24] <kurushiyama> stava: The proper way of data modelling with NoSQL in general and MongoDB in particular reverses the process.
[11:36:28] <kurushiyama> stava: You identify your use cases first. Then, you identify the data you need for those use cases. Then, you model your data so that you can query said data required in the most efficient way.
[11:37:19] <stava> So that the documents in mongodb represent pretty much the data you are actually using? Composing and organizing the data in the right way should be done on the application side when doing create/update operations?
[11:38:02] <kurushiyama> stava: Even when this means some redundancy here and there.
[11:38:37] <stava> But referencing other documents, is that an anti pattern?
[11:38:38] <kurushiyama> stava: As per your example: You'd need to have a second query for your project status when all you want to do is to display a project.
[11:41:15] <kurushiyama> So, have the status explicit: {_id:someId, projectName: "Foo", Status: "NEW",…}
[11:42:40] <stava> The reason why I wanted the status in a separate document though is because a status consists of multiple fields, most importantly "state" and "label", where state is a general constant that is recognized by the system ("new", "complete" etc), and the label may be user defined
[11:42:58] <kurushiyama> Well, now it gets interesting
[11:43:07] <kurushiyama> stava: Say you have a project list page
[11:43:27] <kurushiyama> Where you want to display all projects.
[11:43:53] <kurushiyama> Now, with the explicit status, you can display them right away, without additional queries, right?
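A sketch of the difference kurushiyama is pointing at, with made-up project documents: with the status embedded, the list page renders from a single query, whereas the referenced version would need an extra lookup per status id.

```javascript
// Embedded ("explicit") status: everything the list page needs is in the doc.
// Shell shape: db.projects.find({}, {projectName: 1, status: 1})
const projects = [
  { _id: 1, projectName: "Foo", status: { state: "new", label: "Brand new" } },
  { _id: 2, projectName: "Bar", status: { state: "complete", label: "Done" } },
];
// Render the list page straight from the query result, no extra lookups:
const listPage = projects.map(p => `${p.projectName}: ${p.status.label}`);
console.log(listPage); // → [ 'Foo: Brand new', 'Bar: Done' ]
```

Note that embedding both `state` (the system constant) and `label` (the user-defined text) keeps stava's two-field status requirement intact while avoiding the second query.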
[11:49:25] <kurushiyama> stava: But you asked. And you seem to understand. "You have made your first steps into a greater world!"
[11:49:39] <stava> there are two reasons why i want to do this with mongo really, one is that it integrates easily with nodejs, and the other is that its supposedly very fast
[11:49:55] <kurushiyama> stava: It is, when treated properly
[11:49:57] <stava> and thats another business requirement i guess, speed
[11:50:20] <kurushiyama> stava: Well, then strike Mongoose and put _a lot_ of effort into data modelling.
[11:51:00] <stava> with just the mongodb client/adapter?
[11:51:21] <kurushiyama> stava: As per speed: With the redundancy in data we discussed above, you reduced what would be basically a JOIN (in fact it would be multiple queries) to a simple query
[11:51:42] <kurushiyama> stava: If I were you, I'd use the shell to get my models right
[11:56:49] <kurushiyama> stava: That's actually a _very_ good metaphor for data modelling in MongoDB!
[11:57:48] <kurushiyama> stava: Some rules of thumb I use
[11:58:25] <kurushiyama> stava: "If in doubt, document points in time. As shallow documents in their own collection."
[12:00:05] <kurushiyama> stava: The reason behind this is that you can always use the aggregation pipeline to extract data from it. Conversely, do not preaggregate data too much.
[12:02:08] <stava> that's an entire subject I didn't read up too much on, and perhaps it's naive of me to start modelling my data before I know these subjects :p aggregation seems like a kind of JOIN or UNION
[12:02:39] <kurushiyama> stava: As per your example, you'd log working hours as {_id: someId, start: someISODate, end: someISODate, person: dependsOnYourModel, project...}
[12:03:00] <stava> it's like the aggregation in mysql, with distinct and count
[12:03:15] <kurushiyama> stava: Not only, but close enough for now.
[12:03:46] <kurushiyama> stava: the aggregation pipeline is a killer feature. You should _really_ learn it!
[12:03:56] <stava> yeah im reading the manual page on it now
[12:04:47] <kurushiyama> stava: Which brings us to another rule of thumb: "If in doubt, prefer the aggregation pipeline over m/r."
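Tying this back to the shallow working-hours documents above, a pipeline sketch (field and collection names are made up) might total hours per project. The shell shape is in the comments; the plain-JS part reproduces the `$group` step so it runs standalone:

```javascript
// Shell shape (hypothetical names):
//   db.workhours.aggregate([
//     {$project: {project: 1, ms: {$subtract: ["$end", "$start"]}}},
//     {$group: {_id: "$project", totalMs: {$sum: "$ms"}}}
//   ])
// Plain-JS equivalent of the $group stage:
const entries = [
  { project: "Foo", start: 0, end: 3600000 },
  { project: "Foo", start: 0, end: 1800000 },
  { project: "Bar", start: 0, end: 7200000 },
];
const totals = entries.reduce((acc, e) => {
  acc[e.project] = (acc[e.project] || 0) + (e.end - e.start);
  return acc;
}, {});
console.log(totals); // → { Foo: 5400000, Bar: 7200000 }  (milliseconds)
```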
[12:08:26] <kurushiyama> stava: Next RoT: "Know what the different levels of writeConcern, readPreference and readConcern do, and know it well."
[12:09:38] <stava> without knowing anything, is that like transactions?
[12:10:18] <kurushiyama> stava: Which is _extremely_ important in terms of consistency and durability. There are no transactions. Atomicity is only guaranteed at document level out of the box.
[12:10:46] <stava> sounds like i'll have constant "race conditions" or whatever it's called, when two processes want to modify the same document
[12:10:54] <stava> actually i dont even know what "atomicity" means :p
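To make "atomicity" concrete: a single update using operators like `$inc` or `$set` applies as one indivisible change to one document, so two processes bumping the same counter cannot lose an increment. For read-modify-write flows, a common application-side approach is optimistic versioning. A rough plain-JS sketch of that pattern (all names made up; the shell shape is in the comment):

```javascript
// Emulates: db.projects.updateOne({_id: id, version: expectedVersion},
//                                 {$set: changes, $inc: {version: 1}})
// The version check in the filter makes the write a no-op if another
// process updated the document first.
function compareAndSet(collection, id, expectedVersion, changes) {
  const doc = collection.find(d => d._id === id && d.version === expectedVersion);
  if (!doc) return false; // someone else won the race; caller re-reads and retries
  Object.assign(doc, changes, { version: expectedVersion + 1 });
  return true;
}

const coll = [{ _id: "p1", status: "NEW", version: 1 }];
console.log(compareAndSet(coll, "p1", 1, { status: "DONE" })); // → true
console.log(compareAndSet(coll, "p1", 1, { status: "OLD" }));  // → false (stale version)
console.log(coll[0]); // → { _id: 'p1', status: 'DONE', version: 2 }
```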
[12:12:04] <kurushiyama> stava: Last RoT for today: "Everybody started once. If in doubt, ask."
[12:12:51] <kurushiyama> stava: Hint: ask smart, as in http://www.catb.org/esr/faqs/smart-questions.html
[12:13:55] <kurushiyama> stava: I know this sounds rude. But tbh, I read this essay every once in a while to remind myself.
[12:15:31] <stava> its not rude, i'm just responding with questions, even if i can google them, because i wouldnt feel polite if i didnt respond at all lol
[12:15:50] <stava> no but i appreciate your help, and i'll certainly read up on these subjects
[12:17:05] <stava> it is obvious to me that im rushing things, i need to take time to understand how mongodb actually works and get into the nosql mindset
[12:18:27] <Ange7> kurushiyama: « well, is the server under load » no, I don't think so? But one findOne command on a collection with 608 documents takes 4min... it's very slow
[12:18:56] <kurushiyama> stava: That is a very good approach. If I can help you, do not hesitate to ping me here.
[12:20:20] <kurushiyama> Ange7: Well, there obviously is a problem with your server, as my heavily loaded, 6y old MBA rushes through my test sets of more than 1M docs ;) Do you do those queries from the shell?
[12:20:56] <kurushiyama> stava: The docs are _really_ worth digging through them.
[12:22:50] <stava> kurushiyama: is there any particular book you'd recommend?
[12:23:39] <kurushiyama> stava: No. Never read a book on MongoDB. The docs are extremely well written to get you a jumpstart. After that, you might want to get to school again at university.mongodb.com
[12:23:53] <Ange7> kurushiyama: Yes from the mongo shell
[12:24:17] <kurushiyama> Ange7: unix socket or TCP?
[12:24:31] <stava> for some reason i prefer physical literature, but i'll certainly read the docs
[12:24:49] <kurushiyama> stava: They are available as PDF. Go figure ;)
[12:29:55] <Ange7> I was using TCP a few days ago, then I switched to a unix socket, but I had the same problem
[12:31:06] <kurushiyama> Ange7: Well, we need to debug somehow. There is something seriously wrong if a 4Gb MBA with a 1.6GHz CPU beats your monster machine, no?
[12:31:29] <kurushiyama> stava: For learning, a docker image probably is the best idea.
[12:32:53] <Ange7> yes, I think so too lol, that's why I'm here :D
[12:33:13] <Ange7> but I have no scripts running in the background, and I have the problem in the mongo shell
[12:34:00] <kurushiyama> Ange7: And hence I want to eliminate the differences one by one. First stop: use TCP, even if it is just for debugging. Just want to make sure.
[12:36:58] <Ange7> how can I be sure whether I'm using TCP or a unix socket? Because my scripts connect with a unix socket, but the mongo shell doesn't use my scripts... lol
[12:37:49] <kurushiyama> Ange7: Well, we have to find out. Honestly, I have never used sockets. Gimme a few, need to finish a mail.
[12:57:04] <Ange7> The WiredTiger cache is only one component of the RAM used by MongoDB. MongoDB also automatically uses all free memory on the machine via the filesystem cache
[13:00:04] <kurushiyama> Ange7: In general, MongoDB uses between 85 and 90% of the physical memory. It is said to be 85%, but I have observed up to 90% in some cases, I assume due to aggregations or map/reduce jobs
[13:03:33] <Ange7> I see in the production notes for mongo that it's recommended to start mongo with NUMA
[13:03:52] <Ange7> but when I start mongo I don't get a warning about NUMA, although it's not started with NUMA either?
[13:36:19] <Ange7> kurushiyama: maybe I don't have enough RAM?
[13:47:58] <gain_> hello to all, I'm new to mongo and I need to know if there are features to manage the order of documents inside a collection
[13:48:33] <kurushiyama> gain_: Yes. Sort them on query
[13:49:46] <gain_> kurushiyama: as a noob, the first method I've thought of is to add an "order" field and use it to sort, but I need to know if that's already managed by mongo, in order not to add useless data to my documents...
[13:50:40] <kurushiyama> gain_: Explicitly not. Documents are guaranteed to never be fragmented, but their order is totally arbitrary.
[13:51:25] <gain_> kurushiyama: ok, thanks for the info
[13:52:22] <kurushiyama> gain_: Usually, you'd use an existing field. Say you want to order appointments, and you'd want the 10 most current ones; you would do sth like db.appointments.find({}).sort({date:-1}).limit(10)
[13:54:26] <kurushiyama> gain_: Well, the order is not as arbitrary as it seems, but enough so that you can not rely on it.
[14:04:49] <kurushiyama> Ange7: Well, time to dig into it: Can you show the output of db.serverStatus(), db.stats() for the db in question and the config file? Please use pastebin
[14:23:26] <kurushiyama> Ange7: Well, you need to configure it correctly: https://help.ubuntu.com/lts/serverguide/NTP.html Can not help you much more with that, since I do not use or wish to use Ubuntu.
[14:36:56] <kurushiyama> Simplified: Operations require locks; for WT these are document-level locks. acquireMicros denotes how many microseconds were spent waiting for locks, acquireCount is how often your operations had to wait for locks, and micros/count denotes how long each op waiting for a lock had to wait until it was granted.
[14:38:16] <kurushiyama> Ange7: Since you mostly do updates, as far as I can see, it might well be that there are concurrency issues. However, I am still not convinced that this is the root cause for a simple query taking 4 seconds.
[14:39:29] <Ange7> wait please, I'm using Google Translate ^^
[14:41:32] <Ange7> I mostly do updates with upsert, yes. ~5000 updates per minute, in parallel via gearman
[15:19:13] <kexmex> does mongo package use systemd?
[17:23:42] <jasvir> Hi there. I am using the mongodb zips: http://media.mongodb.org/zips.json and trying to run a geoNear query. Here is the query: http://paste.ubuntu.com/15986315/. But when I run this query it shows: http://paste.ubuntu.com/15986338/
[17:24:06] <jasvir> can anyone please help me figure out what I am doing wrong
[17:30:59] <GothAlice> cheeser / Derick / joannac / Number6: Old link was dead due to Freenode reorganization.
[17:32:04] <zeebee> Hey yall, quick hopefully easy question. how do I start mongod as a service but not use /etc/mongod.conf I wanna use /etc/mongod2.conf
[17:34:06] <zeebee> running mongod --config /etc/mongo2.conf works but i already have a mongo service running that uses the /etc/mongod.conf and I need a second instance
[17:36:06] <GothAlice> zeebee: You'll find that your /etc/init.d or /etc/conf.d (or similar) script which the system uses to spawn the service is referencing that configuration file in the command-line, as you tried doing.
[17:36:39] <GothAlice> One possible solution would be to make a copy of that init script, update it to point at the second configuration file, and register it as a service like the other one.
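GothAlice's suggestion might look roughly like the following. This sketch uses throwaway files in the current directory so it can run anywhere; on a real system the script lives under /etc/init.d (or similar), the file contents are distro-specific, and you would need root:

```shell
# Stand-in for the real init script (contents here are made up):
printf 'CONF=/etc/mongod.conf\nexec mongod --config "$CONF"\n' > mongod.init
# Copy it, then point the copy at the second configuration file:
cp mongod.init mongod2.init
sed -i 's|/etc/mongod.conf|/etc/mongod2.conf|g' mongod2.init
cat mongod2.init
```

On a real system you would then register the copied script as a service alongside the original (e.g. via update-rc.d or chkconfig, depending on the distro).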
[17:47:49] <jasvir> Hi there. I am using the mongodb zips: http://media.mongodb.org/zips.json and trying to run a geoNear query. Here is the query: http://paste.ubuntu.com/15986315/. But when I run this query it shows: http://paste.ubuntu.com/15986338/
[17:47:53] <jasvir> can anyone please help me figure out what I am doing wrong
[18:15:57] <saml> is there a way to unwind an object's keys?
[18:49:06] <saml> not sure why i did it as an object. err: [{errorType: "Video", ..}, ..] is more reasonable
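The reshaping saml describes, going from an object keyed by error type to the `[{errorType: "Video", …}, …]` array form (which array stages like `$unwind` can then work with), can be sketched in plain JS; the field names below follow saml's example:

```javascript
// Object keyed by error type, as originally stored:
const errObject = { Video: 3, Audio: 1 };
// Reshape into the array-of-subdocs form saml prefers:
const errArray = Object.entries(errObject).map(
  ([errorType, count]) => ({ errorType, count })
);
console.log(errArray);
// → [ { errorType: 'Video', count: 3 }, { errorType: 'Audio', count: 1 } ]
```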
[18:49:47] <kurushiyama> saml: Can you give a complete example doc and what you exactly want to do? Oh, and btw, edrocks tends to hang around here, too ;)
[18:50:52] <saml> I run a script. and it just inserts mongodb doc after collecting all errors for each task
[18:51:26] <saml> just wanted to see what type of errors this script found. and how many errors of each kind of error
[18:51:35] <kurushiyama> saml: I am not sure whether you need to have an array of subdocs. I'd probably go with {errs: ["video not found","metadata invalid"]}
[18:52:18] <kurushiyama> As for the type of error: I'd probably not save an error description, as that is rather a frontend thing.
[18:52:22] <saml> yah those are types. i also log actual detailed messages like http response code was 404 for video url such and such
[18:53:08] <kurushiyama> so you basically have an error log, right?
[18:53:14] <saml> i put stuff to db instead of logging cause i thought this script will run for days. but actually it finished in 30 mins :P
[18:55:53] <kurushiyama> saml: Pardon me? Never used node...
[18:56:06] <kurushiyama> saml: Java, Go, a bit c++
[18:56:47] <saml> node.js is async out of the box best suited for massive web scale database such as mongodb. async io means it's really fast turning io bound programs to disk bound cause disk is cheap
[18:56:51] <kurushiyama> saml: And especially mgo works like a charm ;)
[18:57:23] <saml> do you like Java or Go if you had to pick one for a web application ?
[18:58:00] <kurushiyama> saml: We could argue on that. But lets come back to your problem. If in doubt, document points in time as shallow as you can. So if an error occurs, you simply log it flat.
[18:58:15] <kurushiyama> saml: Depends on the web application ;)
[19:03:02] <kurushiyama> saml: Aye. And when you need the details, you can simply query the log. db.errorlog.find({videoId:"some/…"}).sort({date:-1}).limit(whatever)
[20:13:13] <Doyle> How would you get the data size from a cluster?
[20:56:14] <zylo4747> how come when i run 'show dbs' it says my DB is 7.9 GB but when I switch context to the db and run db.stats() it shows storage size as 1.6 GB
[20:56:23] <zylo4747> is that the storage being used?
[20:56:55] <zylo4747> is the fileSize "allocated" and storageSize "used"?
[21:05:41] <jasvir> Hi, I have a query which returns states using geoNear. I want to calculate the total number of cities in each of these states. Can anyone please help me do this? Here is my query: http://paste.ubuntu.com/15991581/
[21:20:11] <jasvir> kurushiyama: can you please help?
[23:43:58] <colwem> New to mongodb. I want to dump a database as is to a file of js statements that can be used to recreate the db. I also want to be able to drop the db. This is for testing.
[23:45:37] <colwem> I guess it's not totally necessary that it's in javascript but I'd prefer it if the dump file is something I can read and edit if I need to.