[02:12:26] <MacWinner> how do you see the actual total space used by documents in a collection? show dbs seems to show allocated disk.. but if I delete a bunch of documents, how would I see the difference before and after?
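A quick way to check this from the shell is the stats helpers; dataSize counts the documents themselves, while storageSize is what has been allocated on disk (the collection name below is just a placeholder):

    // database-level summary
    db.stats()
    // per-collection: dataSize (documents) vs. storageSize (allocated on disk)
    db.mycoll.stats()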
[03:16:31] <MacWinner> as part of my plan to move from a replica set to a sharded cluster, is the first prudent step to introduce the mongos router and point it to the existing replica set? i'm trying to phase in different pieces and let them burn in independently
[05:37:04] <InAnimaTe> hey all, got a quick question. I want to do an index where for a given key, another key is unique. so for my birthday(event), i can only have one yellow cake(type). Any ideas how to go about this? im guessing a unique compound index?
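Assuming the events live in a collection called cakes with event and type fields (names made up here), a unique compound index would look roughly like this:

    // at most one document per (event, type) pair - inserting a second
    // "yellow cake" for the same birthday fails with a duplicate key error
    db.cakes.createIndex({ event: 1, type: 1 }, { unique: true })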
[09:25:18] <nicolas_FR> hi there, I need some advice on mongoose schemas VS models (#mongoose is empty), so if anyone knows... (stackoverflow): http://tinyurl.com/j24dbcw. I have the same problem (sub document CastError). The answer says to export the schema and not the model, but it doesn't work for me. Could someone explain to me how to export the schema like the answer says?
[09:30:26] <Keksike> What might be the easiest way to log and compare all the operations that go into my mongodb before and after a change in my software?
[09:33:01] <Keksike> or not even the easiest way, but the best way :)
[09:33:36] <Keksike> I updated my software's Java driver from 2.x to 3.0, and something started behaving differently. I want to find out what that something is.
[09:42:15] <aps> One of my replica-set members is stuck in ROLLBACK state with the following logs. Any idea what could be wrong?
[09:42:45] <aps> this keeps repeating in the logs? Any ideas what's wrong here?
[11:30:17] <kbn2> hello guys, can someone help me with this. I need to update objects inside an array of a document, with the document's own _id... e.g. db.c.update({}, {"test.sample_id": _id})
[11:33:43] <SmitySmiter> hello guys, can someone help me with this. I need to update objects inside an array of a document, with the document's own _id... e.g. db.c.update({}, {"test.sample_id": _id})
[11:53:15] <kurushiyama> SmitySmiter: In case you are asking whether this can be done by query only, the answer is no. But I would be pretty interested in the _reason_ for this rather unusual update.
[11:53:51] <SmitySmiter> kurushiyama, I'm doing a migration from an old data structure to new data structure
[11:54:16] <SmitySmiter> I'm writing a function based on the answer posted here http://stackoverflow.com/questions/15846376/mongodb-copy-a-field-to-another-collection-with-a-foreign-key
[11:56:28] <kurushiyama> SmitySmiter: Stennie is extremely knowledgeable, and I have the utmost respect. The mention of a foreign key, however, makes me doubt whether it is a good idea. And for data migrations, aggregations are probably the most suitable tool.
[11:56:56] <kurushiyama> SmitySmiter: Maybe I can help you a bit later, off for 45
[11:57:19] <SmitySmiter> kurushiyama, thanks a lot, but I think I figured how I can do this with a function like that, testing on a sample db
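For reference, a rough sketch of that kind of migration in the shell, using the collection/field names from the pseudocode above; since a plain update can't reference the document's own _id, it has to iterate:

    db.c.find({ test: { $exists: true } }).forEach(function(doc) {
      doc.test.forEach(function(el) {
        el.sample_id = doc._id;   // copy the parent's _id into each array element
      });
      db.c.update({ _id: doc._id }, { $set: { test: doc.test } });
    });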
[11:57:49] <SmitySmiter> and yea, unfortunately there were some silly decisions taken during building this stuff that brought in some RDBMS stuff to NoSQL
[12:35:22] <jokke> i have a question about $text searches
[12:36:06] <jokke> i want my search to behave so that it uses the logical AND operator instead of OR
[12:36:52] <jokke> so if i search for 'this is a test' it should only match documents with the words "this" and "test" ("is" and "a" are probably dropped as stop words anyway)
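One possible workaround, given that $text ORs individual terms by default: quoted phrases all have to be present, so wrapping each word in escaped quotes forces AND semantics (collection name is a placeholder):

    // matches only documents containing both "this" and "test"
    db.articles.find({ $text: { $search: "\"this\" \"test\"" } })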
[13:36:17] <m_e> let's say i have many books with a published date in my database. can i somehow select one random book for each distinct year?
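There is no single operator for "one random document per group", but a possible sketch (assuming a numeric year field, or one derived from the published date, and MongoDB 3.2+ for $sample) is to loop over the distinct years and sample one book from each:

    db.books.distinct("year").forEach(function(y) {
      var pick = db.books.aggregate([
        { $match: { year: y } },     // restrict to this year
        { $sample: { size: 1 } }     // pick one document at random
      ]).toArray()[0];
      printjson(pick);
    });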
[15:06:23] <adnkhu> the java program just throws an exception and exits
[15:06:35] <adnkhu> whereas I can see using the mongo shell
[15:06:40] <adnkhu> that an election has taken place
[15:06:46] <adnkhu> and a new primary has been selected
[15:06:58] <adnkhu> i have put it on the forums as well
[15:07:11] <adnkhu> but have not gotten any response yet
[15:10:44] <kurushiyama> adnkhu: The driver notifies you that there is no primary atm. You still have to handle this situation, for example wait an acceptable time and redo your statement.
[15:13:35] <adnkhu> but when i do rs.stepDown() it waits for a little while and continues inserting
[15:13:41] <adnkhu> without throwing any exceptions
[15:13:58] <adnkhu> shouldn't the driver handle that
[15:15:57] <adnkhu> completely independent of the application
[15:16:32] <adnkhu> whereas in the case of me stopping the primary process it throws an exception and exits
[15:16:57] <Derick> adnkhu: change of primary means all connections die
[15:17:17] <adnkhu> yes i can see that in the logs
[15:17:43] <adnkhu> but how does the driver continue in case of rs.stepdown()
[15:17:56] <adnkhu> i can share the code if someone can help
[15:18:09] <kurushiyama> adnkhu: rs.stepDown() is meant to (sort of) gracefully make a primary a secondary, for example for maintenance work. Shutting down a primary is equivalent to a failover situation.
[15:19:37] <adnkhu> what would i need to do in code in case of hard failure?
[15:19:58] <adnkhu> as i have already provided all three mongods in the client connection
[15:20:31] <adnkhu> MongoClient client = new MongoClient(Arrays.asList( new ServerAddress("AKhurramL2", 27017), new ServerAddress("AKhurramL2", 27018), new ServerAddress("AKhurramL2", 27019)));
[15:20:55] <kurushiyama> adnkhu: As said above: catch the exception, and then do what you deem appropriate. Some people simply wait for some time and try to redo the operation, others check the replica set status and act according to the situation, the next might just ignore it.
[15:26:41] <adnkhu> if i do rs.stepDown() it kinda waits like it's buffering some inserts or something and then continues, which is what i thought it should do in case of hard failure as well
[15:27:54] <kurushiyama> adnkhu: The driver is notified of the imminent step down and the elections in case of an rs.stepDown(), iirc.
[15:28:06] <cheeser> well, the driver will wait for the election to finish and discover the new primary...
[15:29:19] <adnkhu> ok thats what i wanted to know, just so that i understand it correctly, in case of rs.stepDown() java driver is notified and it waits for the election, in case of hard failure i will have to connect to the replica set again as connection is lost
[15:30:55] <adnkhu> just another quick question, if i do rs.stepDown() on the primary and then let the program continue for 2-3 mins and then do rs.stepDown() on the new primary as well, then again it throws an exception and exits
[15:31:07] <adnkhu> any thoughts on what is happening in this case
[15:33:17] <kurushiyama> cheeser: Do you have to reconnect? Thought you only have to make sure that election is done...
[15:34:25] <cheeser> no, no need to reconnect since the connection is still valid.
[15:37:32] <kurushiyama> cheeser: Just checked. Might have something to do with the fact that I was connecting via mongos and there were multiple redundant application servers involved pulling in messages and saving them to MongoDB. Failing simply did not remove the message from the queue, hence I never had to check the mongos connection, really.
[15:39:02] <adnkhu> works and the connection remains valid, thanks guys
[16:32:32] <crazyphil> is there any way to get mongos to quiet down the number of messages it generates in /var/log/messages?
[16:33:01] <kurushiyama> crazyphil: If it logs requests, you should have a _very_ close look on them.
[16:35:22] <crazyphil> kurushiyama: here's an example of what I'm seeing: http://pastebin.com/TyjPDM03
[16:36:16] <crazyphil> I'm seeing about 10k messages every 15 minutes from mongos
[16:36:47] <kurushiyama> crazyphil: Use syslog, rotate to your needs.
[16:40:34] <crazyphil> but all those connection messages - is there a way to stop it from sending those at least?
[16:41:30] <kurushiyama> crazyphil: Not sure. I never bothered, since imho, using logfiles instead of syslog should be punishable by memory-scrambling or insta-segfault anyway.
[16:43:40] <crazyphil> those messages are from /var/log/messages
[16:44:10] <crazyphil> they get shipped via rsyslog into kafka, where logstash pulls, processes and pushes them into elasticsearch
[16:44:35] <crazyphil> so now ES is telling me I have a lot of messages coming from things running mongos
[17:03:42] <kurushiyama> crazyphil: I am not quite sure I understand you. You should be able to ignore unwanted messages in your syslog setup.
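If the noise is mostly "connection accepted / connection ended" lines, mongos (and mongod) can also be run in quiet mode, which suppresses those along with some other informational output; a minimal config sketch:

    # systemLog.quiet limits output such as connection accepted/ended events
    systemLog:
      quiet: true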
[17:07:05] <sumobob> How can I execute a geoJson $near query if the $maxDistance is stored in the model?
[17:22:53] <kurushiyama> sumobob: You want to have your query done based on the _result_?
[17:36:17] <cheeser> because since Go, the language, is a 70s throwback, why not resurrect decades-old imagery, too? :D
[17:36:23] <kurushiyama> cheeser: I would not argue against that. But then, I... would not use php.
[17:36:52] <hardwire> Go has been off my radar until recently.
[17:37:14] <kurushiyama> cheeser: Huh? Didn't know that you dislike Go. Personally, it feels like a rather big advancement, compared to some other languages around nowadays.
[17:37:30] <cheeser> go is ok. the tooling around it is terrible and there is no consistent way to build apps.
[17:45:37] <kurushiyama> sumobob: Maybe you should explain the actual use case.
[17:46:05] <sumobob> basically you enter the distance you are willing to travel, and when someone looks for people it needs to bring in anyone who will travel to their location
[17:46:14] <sumobob> I've got it all in there as geoJSON point
[17:47:03] <kurushiyama> sumobob: Usually, you have a _known_ distance. Let's say "find all Hardee's in a radius of 20k miles" (which is the distance I am starting to accept to get a Monster Thickburger).
[17:47:39] <sumobob> gotcha, when you signup you select your travel range from an array of 5, 10, 20, 30
[17:48:00] <sumobob> could I just do 4 queries and aggregate the results?
[17:48:19] <kurushiyama> sumobob: Nah, not quite. So a document would describe a user, his location and the distance he or she would be willing to travel?
[17:49:12] <sumobob> the user who searches has the same schema and I'm doing a $near: with their location to at least sort the results
[17:54:39] <kurushiyama> sumobob: As per logic: if the user doing the search is willing to travel the distance between the users, isn't the problem already solved? The user doing the search is required to travel, then. No?
[17:55:38] <sumobob> the user who is searching needs to find people who will come to them
[17:56:31] <sumobob> yeah its tricky, the way i have it now I just calculate the distance from each person to the user, then i filter that array based on the travel_range
[17:56:40] <sumobob> but as you can see this will not scale
[17:57:14] <kurushiyama> I might have an idea. Gimme a few.
[17:59:23] <kurushiyama> Yup, should work: You could do an aggregation, use https://docs.mongodb.com/manual/reference/operator/aggregation/geoNear/, put out the distance as a field, then do a redact comparing distance and the returned docs "willingness to travel this distance" field.
[18:02:20] <sumobob> awesome that looks great, whats a redact?
[18:04:15] <kurushiyama> sumobob: The aggregation pipeline is probably the feature in MongoDB I love the most.
[18:07:18] <kurushiyama> sumobob: But to answer your question: it is another aggregation pipeline stage command.
[18:07:51] <sumobob> so it looks like I do a $geoNear with distanceField: 'calculated_distance', then a $redact: { $cond: { if: { $gt: ['calculated_distance', 'travel_range'] }, then: '$$PRUNE', else: '$$DESCEND' } }
[18:10:14] <kurushiyama> sumobob: Looks about right. Not using redact too often, I am not sure about $$DESCEND, since you are not going to eval anything else. So $$KEEP might be better, here. Simply try.
[18:12:16] <kurushiyama> sumobob: Sorry, have to prep dinner. Simply try. If in doubt, remove the $redact stage first, to make sure the output of the first stage is correct.
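Putting the two stages together, roughly (the location coordinates and the travel_range field name are assumptions, and the field references inside $redact need the $ prefix; note that with GeoJSON points the computed distance is in meters, so travel_range has to use the same unit or be converted via distanceMultiplier):

    db.users.aggregate([
      { $geoNear: {
          near: { type: "Point", coordinates: [ -73.99, 40.73 ] },  // the searcher's location
          distanceField: "calculated_distance",
          spherical: true
      } },
      { $redact: {
          $cond: {
            if: { $gt: [ "$calculated_distance", "$travel_range" ] },
            then: "$$PRUNE",   // farther than this user is willing to travel
            else: "$$KEEP"
          }
      } }
    ])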
[18:22:24] <MacWinner> hi, i'm trying to get a better idea of how deleted files in gridfs work (with WiredTiger).. if I delete a file, then all the chunks are deleted, and that space is not reclaimed I know.. however, if a new file is added, are the new chunks inserted into spots where the old chunks were? does the new file need to be exactly the same size?
[18:25:46] <kurushiyama> MacWinner: Basically, GridFS is only a convention.
[18:26:09] <kurushiyama> MacWinner: With all other respects, it behaves exactly the same as the underlying storage engine.
[18:26:15] <MacWinner> yeah.. I think the only reason I mention it is because by convention the chunk sizes are fixed
[18:26:35] <kurushiyama> MacWinner: As is the doc size ;)
[18:27:47] <kurushiyama> cheeser: Huh? A gridfs chunk of lets say 1kb does not take up 16MB, does it?
[18:27:52] <MacWinner> so if after deleting a 1MB file from GridFS (which deletes the fs.files and fs.chunks documents) then I add a 2MB file to GridFS.. will the original 1MB be reclaimed/used?
[18:29:23] <kurushiyama> cheeser: I would have been more than a little surprised.
[18:29:40] <cheeser> whether a particular patch of disk space gets reused is a function of document padding and the storage engine.
[18:33:05] <kurushiyama> Hm, which, as per the "documents are never fragmented" paradigm, would make it impossible for 1MB of space to hold that new 2MB chunk doc, and hence the datafile would get expanded, if I am not mistaken. Well, with WiredTiger, compression might be of interest, too.
[18:42:04] <MacWinner> cheeser, sorry, could you give me an example of where the patch of disk would get reused?
[18:42:21] <MacWinner> or maybe a pointer to a doc that explains some examples for better understanding
[18:45:25] <cheeser> space on the disk is allocated based on availability, either in an already-allocated slab on disk or after the storage space has been extended.
[18:45:58] <cheeser> mmapv1 won't return disk space to the OS (thus continuing to grow and reuse as needed) without running an explicit repair()
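For completeness, the repair in question is the shell helper below; it rebuilds the data files and is the usual way to reclaim space under mmapv1, but it blocks the node and needs roughly as much free disk as the data set, so it is typically run one member at a time:

    // rebuilds the database's data files; under mmapv1 this releases unused space to the OS
    db.repairDatabase()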
[18:47:13] <MacWinner> so basically with wiredtiger, I don't need to concern myself with massive inefficiencies if I have a lot of creation/deletion of files in gridfs?
[18:47:32] <MacWinner> kurushiyama, thanks. I'll check it out!
[18:47:49] <kurushiyama> MacWinner: not as far as disk space is concerned, as far as I know.
[18:48:18] <kurushiyama> MacWinner: One could ask of the application efficiency if you have a lot of deletes ;)
[18:48:24] <cheeser> right. unused extents are returned to the OS.
[18:51:53] <kurushiyama> cheeser: To get this straight in my head: Said returning of unused extents only applies to extents at the end of the datafile, right?
[19:20:10] <cheeser> kurushiyama: what is the "end" of a collection of files?
[19:38:03] <kurushiyama> let's say we have like 5 blocks X|X|X|X|X, where X denotes a used MB in the data file. Now, if a file is deleted, say we have X|0|X|X|X. The 0 part would not be returned, if I get it right. Space would only be returned if the data at the end of the file were deleted, resulting in X|X|X|X|0. Then, it would be "truncated" to X|X|X|X. But that is just a theory.
[20:06:19] <kurushiyama> Sagar: That's not exactly "soon"
[20:06:22] <hardwire> Sagar: I'm not entirely convinced it'd be noticeable what those updates are.
[20:06:24] <Sagar> can this help me? https://github.com/mongofill/mongofill
[20:06:32] <kurushiyama> Sagar: In IT terms, 3 years translates to "ages"
[20:06:59] <hardwire> Sagar: stick with 14.04 LTS and PHP5 if that's what works.. it'd be folly to do anything else if you're up against a big rewrite.
[20:07:10] <hardwire> now if PHP5 is coming to an end.. that's different.
[20:08:28] <Sagar> This package has been superseded, but is still maintained for bugs and security fixes.
[20:08:39] <kurushiyama> Sagar: Let me sum it up: You'd rather do an "on-the-fly" rewrite of the persistence part of your application (hence most likely unplanned and untested) just to get an OS version whose only advantage is "newer updates", rather than stay on an OS which will be supported until 2019 and is proven to work for you?
[20:10:09] <kurushiyama> Sagar: If I were you, I'd do the migration to a known good platform, plan a migration to php7 then and if you have done your rewrite and tested it, plan an according update of the server OS. Just a suggestion.
[21:27:27] <shlant> hi all. Can you LOWER log verbosity? as in only log warning/errors?
[21:27:46] <kurushiyama> adnkhu: You might want to either answer your own question on dba.so or remove it.
[21:28:26] <kurushiyama> shlant: Sure. Use syslog and configure it accordingly. Writing to logfiles directly is devil's work, anyway.
[21:34:06] <shlant> kurushiyama: yea I am currently running it through fluentd to ES, so I was asking if it can be done at the mongo level, but I guess I will have to do it at the fluentd level
[21:44:21] <Derick> btw - for people using PHP 7 and not wanting to spend a lot of time upgrading from ext-mongo to ext-mongodb, there is: https://packagist.org/packages/alcaeus/mongo-php-adapter
[21:50:45] <kurushiyama> Derick: Thanks. Bookmarked for giving it as a hint.
[22:26:21] <irdan> kurushiyama: all of my nodes are secondary, and rs.status gives me "errmsg" : "not authorized on admin to execute command { replSetGetStatus: 1.0 }",
[22:36:57] <irdan> I looked in the mongo logs and noticed that right after restarting I see: [IndexRebuilder] ERROR: mmap private failed with out of memory. (64 bit build)
[22:37:52] <irdan> and I definitely have 20+GB ram free and almost a T of free disk on that host
[22:38:05] <Lonesoldier728> Hey how do I make this query work... .find({$or: {spotlight: {$exists: false}}, {spotlight: 0}, {spotlight: null}})
[22:38:38] <kurushiyama> irdan: Well, I can not say much here. Remote debugging is kind of hard... ;)
[22:38:42] <Lonesoldier728> Trying to say give me back anything that is not spotlight: 1 might be easier
[22:39:08] <kurushiyama> Lonesoldier728: Well, there is a $not operator...
[22:39:26] <irdan> kurushiyama: hehe, no problem. thanks for your help. If I run into a dead end with this I'll try to do what you outlined above
[22:40:49] <Lonesoldier728> I tried it with the $or, it gave me a bunch of errors
[22:41:59] <kurushiyama> Lonesoldier728: Such as...
[22:55:40] <Lonesoldier728> let me see trying this out
[22:57:39] <Lonesoldier728> Well that did not work
[22:58:16] <Lonesoldier728> It is returning nothing... now just to give you an idea spotlight could also not exist so is there a way to add or with the not for not exists
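For reference, $or takes an array of clauses, and a plain $ne also matches documents where the field is missing, which may be the simplest route here (collection name is a placeholder):

    // explicit form: spotlight missing, 0, or null
    db.items.find({ $or: [
      { spotlight: { $exists: false } },
      { spotlight: 0 },
      { spotlight: null }
    ] })

    // shorter: everything whose spotlight is not 1 (this also matches missing fields)
    db.items.find({ spotlight: { $ne: 1 } })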