[00:08:29] <donski> Hi, is it possible to have nested groupings as part of the mongo aggregation pipeline? I'm stuck on an issue where I want to have a nested grouping as the final output. I've detailed it here: http://pastie.org/8652193 if anyone could take a look and advise me
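(The pastie above has since expired, so the exact shape of the data is unknown; as a general sketch, a nested grouping is usually built from two $group stages, with the second $push-ing the first level into an array. The collection and field names below are hypothetical.)

    db.orders.aggregate([
        // first level: one bucket per country + city
        { $group: { _id: { country: "$country", city: "$city" },
                    total: { $sum: "$sales" } } },
        // second level: regroup by country, pushing the per-city totals into an array
        { $group: { _id: "$_id.country",
                    cities: { $push: { city: "$_id.city", total: "$total" } } } }
    ])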
[01:31:29] <george2> can I force mongodb to use a sequential _id starting from 1?
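(There is no built-in sequential _id; the usual workaround is a separate counters collection incremented atomically with findAndModify. A minimal sketch, with hypothetical collection and field names:)

    // atomically fetch the next value from a counters collection
    function getNextSequence(name) {
        var ret = db.counters.findAndModify({
            query: { _id: name },
            update: { $inc: { seq: 1 } },
            new: true,
            upsert: true
        });
        return ret.seq;
    }

    // use it when inserting
    db.users.insert({ _id: getNextSequence("userid"), name: "first user" });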
[03:05:38] <SaintMoriarty> I am currently writing a test app to get used to NoSQL and mongo. I am using the MEAN stack and wanted to know the best way to take an update POST in Node and apply it to the model?
[03:15:03] <kveroneau> Hello everyone. I have a scary question about MongoDB. Although this video on YouTube was uploaded 2 years ago, I really need to know if this is still a fact, or was just FUD.
[03:15:35] <kveroneau> It says something about MongoDB not writing/flushing to disk often enough, making data loss more of a possibility.
[03:15:51] <kveroneau> Video I found is here: https://www.youtube.com/watch?v=URJeuxI7kHo
[03:16:42] <kveroneau> Is this FUD? Please say it is so, I really like MongoDB, but some things said there are sort of... well... scary for a DBA or someone who makes choices on what DB to use in an app.
[03:24:17] <kveroneau> retran, never said I am a DBA. However, I am currently in a position where I need to confirm that MongoDB will work for my client's web application. I am doing some research, and having tested MongoDB with Python it seems almost too good to be true.
[03:24:57] <retran> if you want to ask a real question, i suggest asking a real question
[03:25:18] <retran> with things like "how often/under what circumstances does mongod flush"
[03:25:36] <retran> that's how computer scientists conduct research
[03:26:19] <retran> and then, like any good scientist, compare it to something else
[03:26:26] <kveroneau> Can the flushing be manually forced via the API?
[03:26:39] <retran> kveron, have you ever used a database before?
[03:27:42] <kveroneau> retran, yes I have used MySQL for quite a while. I use transactions to make sure my changes are atomic.
[03:27:52] <retran> are you aware, for example, that "flush tables" in mysql doesn't literally guarantee the data is written once the command is done
[03:28:31] <retran> in mysql, innodb has tons of data that is not yet flushed to disk at any given time
[03:28:44] <retran> and "flush tables" is no remedy
[03:28:51] <retran> the only remedy is mysql shutdown
[03:29:16] <retran> it has 0 things to do with the server, and everything to do with the nature of transactional processing
[03:29:27] <retran> 0 things to do with the quality of the server, i mean
[03:30:07] <retran> if you want to see heartbreaking data loss, talk to people who've done important things with Mysql
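(To answer the earlier question about forcing a flush through the API: the fsync command does exactly that, though with journaling enabled the window for data loss is already bounded by the journal commit interval. A sketch from the mongo shell:)

    // flush all pending writes to the data files
    db.adminCommand({ fsync: 1 })

    // flush and block further writes until unlocked (e.g. for backups)
    db.adminCommand({ fsync: 1, lock: true })
    db.fsyncUnlock()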
[03:32:59] <kveroneau> Hmm, this has been the crux of the SQL vs NoSQL debate: each party states that their DB has data integrity and the other doesn't.
[03:34:07] <kveroneau> In the #django channel, they mentioned "risking your data integrity", whereas in this channel, you say there is data integrity.
[03:34:57] <retran> i think you're more interested in personal drama
[03:38:04] <kveroneau> I already talked to Eliza recently about my personal drama, so I don't think that's it.
[03:39:57] <cheeser> there are no known data loss issues last i heard a few weeks back
[03:41:54] <kveroneau> cheeser, thank you. That's what I wanted to hear. I am sure there are lots of developers using it in production who can report solid statistics.
[03:42:12] <retran> a simple affirmation is enough for you?
[03:46:10] <kveroneau> Just awful, a search for "mongodb case studies" isn't really that helpful - the top results are, of course, from mongodb's own website.
[03:47:56] <cheeser> straight from the horse's mouth
[03:49:13] <kveroneau> The only information to reassure me of cheeser's claim is the company list on the MongoDB website, which shows lots of large companies with usually high traffic volumes using MongoDB. For example, eBay wouldn't use a DB without proper data integrity, would it?
[05:21:07] <Mallanaga> I'm trying to learn this stuff... and I'm using magic cards as a tool... what's a good way to denormalize a JSON file into several documents?
[09:05:28] <axi> mongodb on a raspberry pi -> segmentation fault! any ideas? pm me please
[09:05:51] <Nodex> axi, I am not sure mongodb is supported on the pi processor
[09:07:11] <Nodex> perhaps you can try this : http://c-mobberley.com/wordpress/index.php/2013/10/14/raspberry-pi-mongodb-installation-the-working-guide/
[09:08:16] <crashev> Nodex: ok, I saw that, I was expecting some way that I can redirect it to file from the console/mongo shell itself
[09:08:37] <crashev> megawolt: thx, will check it out, have not used this so far
[09:11:14] <Nodex> crashev : megawolt's offering is for EXPORTING data
[09:14:28] <megawolt> @crashev here is your sample http://pastie.org/8653084
[10:00:34] <r1pp3rj4ck> anyone else know something about this?
[10:01:12] <theblackbox> hello all, I'm just looking for a way to start the mongod as a service - it currently locks the terminal instance that inits it and this is very annoying when trying to script. I'm pretty damn sure I can't be the only one that's tackled this, but I can't seem to find any useful info
[10:09:57] <theblackbox> but I'd like to script the starting and stopping of the DB as part of my deployment, so it would be desirable to swoop in on SSH wings and --restart the mongos instance. But given the way the mongod "service" behaves, this breaks my deployment script - there is no way I can automate the starting/stopping of mongod
[10:10:46] <r1pp3rj4ck> theblackbox, put an & on the end?
[10:14:22] <r1pp3rj4ck> theblackbox, so is that what you were looking for?
[10:14:57] <theblackbox> Nodex: correct, but I think it was simply a matter of forgetting the trailing & on the init command. Once I've put that in place the rest should fall together
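(For anyone scripting this later: rather than relying on a trailing &, mongod can daemonize itself with --fork and be stopped cleanly with --shutdown. A rough sketch; the paths here are hypothetical:)

    # start mongod detached from the terminal (--fork requires a log destination)
    mongod --fork --logpath /var/log/mongodb/mongod.log --dbpath /data/db

    # stop it cleanly from a script
    mongod --shutdown --dbpath /data/db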
[10:46:12] <abhishek> kali: For example, i have documents in a collection with duplicate emails. i can get the unique ones using db.collection.distinct('email'), but this returns only the email values. is it possible to fetch the other fields as well?
[10:46:31] <Nodex> aggregation framework or map/reduce it
[10:46:55] <abhishek> Aggregation would be nice for better speed i guess.
[10:47:51] <kali> yeah, aggregation is faster, but there is a 16MB limit on the result
[10:48:02] <kali> so in the end, you may need to go map/reduce
[10:50:09] <abhishek> oh, i forgot about that. anyway, i don't think the size will ever go beyond that limit.
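(A sketch of the aggregation in question: group on email and keep a representative value for the other fields with $first. The name field is hypothetical, and without a preceding $sort the "first" value is simply whichever document is encountered first:)

    db.collection.aggregate([
        { $group: {
            _id: "$email",               // one bucket per distinct email
            name: { $first: "$name" },   // keep one value for the other fields
            count: { $sum: 1 }           // how many documents share this email
        } }
    ])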
[12:42:06] <megawolt> or http://docs.mongodb.org/manual/reference/command/shutdown/
[12:42:53] <kali> ha. that just freezes one member... basically the idea of repairing a replica set is: run repair on each secondary (in turn), then switch over and run it on the former primary
[12:42:54] <scristian> I need to run repairDatabase only to reclaim disk space
[12:49:56] <kali> scristian: then you need a failover. you can either run repair on the primary and let the secondaries elect a new primary, or trigger the failover yourself by freeze()ing the primary, altering the configuration, or just stopping it
[12:50:29] <kali> scristian: in all cases, you'll have a few seconds of mayhem
[12:54:08] <scristian> great, thank you so much for the help
[12:55:02] <kali> scristian: you're aware you need twice the database size on disk for the repair to run ?
[12:56:01] <scristian> right now it is 40G, and after repair it will be 5G. do I need twice 40G?
[12:57:16] <kali> scristian: mmm no. you should need the dataset size + a few GB
[12:57:27] <kali> scristian: so about 7GB in your case
[12:58:14] <kali> scristian: because another option is to just delete everything from the former primary disk, and let it pull the dataset from one of the two other nodes
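(A rough outline of the rolling repair kali describes, run from the mongo shell on each member in turn:)

    // on each secondary in turn, reclaim the disk space
    db.repairDatabase()

    // then make the current primary step down so a secondary takes over
    rs.stepDown(60)

    // once the former primary is a secondary, repair it as well
    db.repairDatabase()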
[13:03:13] <chronos> hello guys, is anyone here working with Delphi and the Delphi Mongo driver? i need some help with a find command
[13:15:51] <avril14th> Hello, is it possible to stream mongodb's log from a remote instance?
[14:02:23] <bdiu> Anyone interested in a few hours paid consulting? I'm trying to come up with the ideal strategy for map/reduce for ongoing data analysis of our data set...
[14:17:29] <bdiu> I think the $100/hr area is still very reasonable for a professional... just not an agency rate
[14:34:34] <lgp171188> How do I connect to the default test database as an admin user? I have enabled auth and added a user to the admin database. Using that user account and credentials I am able to connect to the admin database, but I want to be able to connect to the test database and all the databases. How do I do it?
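(One way to do this, assuming the admin user was given a role that spans databases, e.g. readWriteAnyDatabase on admin: authenticate against the admin database first, then switch. The credentials and collection below are hypothetical:)

    // authenticate where the user was created, then switch databases
    use admin
    db.auth("admin", "secret")
    use test
    db.mycollection.find().limit(1)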
[15:04:19] <Razz_> I store an array of longs using the C++ API, which works just fine. However, when I try to retrieve the data again one of the documents (not all, just one) throws an assertion error 10320:BSONElement: Bad type 110 / 50 / etc
[15:04:57] <Razz_> so it seems that it goes wrong in the BSONElement::size() function, which is incapable of finding the type for that document, even though at insertion time they are all the same (NumberLong)
[15:05:26] <Razz_> google is trying to tell me the DB is corrupt, however if I set the value to 42 everywhere it works again -.-
[15:06:12] <Razz_> The odd thing is though that printing the document as JSON shows a correct JSON object and using the Mongo CLI also finds and shows the document just fine
[15:06:42] <Razz_> TL;DR I'm getting a BSONElement:Bad type <random number> error, any clues?
[15:11:07] <Joeskyyy> Quick glance at the c++ docs shows that BSONElement::size returns an int (if I'm reading this correctly)
[15:14:01] <Razz_> calling 'valid()' on the BSONObj also returns false, so the object is genuinely broken, just what about it is broken is a mystery to me
[15:16:50] <theblackbox> can I set a log level? I would like to see what command is being issued to the server from my node app
[15:21:02] <Joeskyyy> Last I heard the way to do that was to turn on the profiler and set the "slow queries" threshold to a really low value
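(That would look roughly like this; the 1 ms threshold is just an example:)

    // log every operation slower than 1 ms to db.system.profile
    db.setProfilingLevel(1, 1)

    // or capture everything, regardless of speed
    db.setProfilingLevel(2)

    // inspect what was recorded
    db.system.profile.find().sort({ ts: -1 }).limit(5).pretty()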
[15:21:56] <richthegeek> can anyone give me a brief summary of the issue with excessive data usage? I've got a DB that's 2gb now with only ~270mb of data+indexes... the database is mostly append-only so it just seems insane?
[15:23:41] <Joeskyyy> richthegeek: Has to do with record padding most likely, mongo likes to use extra space to avoid document moves on disk
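(A quick way to see where the space is going, and the command that reclaims it; the collection name is hypothetical, and compact blocks the database while it runs:)

    // compare dataSize vs storageSize vs fileSize to see padding and preallocation overhead
    db.stats()
    db.mycollection.stats()

    // reclaim the space for one collection (or run a full repairDatabase)
    db.runCommand({ compact: "mycollection" })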
[16:03:33] <oceanx> hello, I've created a replicaSet with two members (one primary and one secondary) and an arbiter. i shut off mongo beforehand (i converted a standalone mongo to a replicaSet), copied all the content over ssh to the secondary, then started all the mongod instances, configured the replicaSet on the primary, and added the secondary and the arbiter
[16:04:46] <oceanx> now what's strange is that i only had a db called "applicationdb" (90GB), and now I can see there's also a db called "local" which weighs almost the same on both the secondary and the primary
[16:05:02] <cheeser> local is used for replication
[16:06:33] <oceanx> in fact i could only find the oplog and a little other information inside. is there a way to make sure it doesn't grow too large? or is it normal that it takes that much? (right now I don't have a space problem since it resides on a 2tb xfs partition)
[16:07:37] <cheeser> well your oplog will be largeish to replicate the changes to the secondary
[16:07:59] <cheeser> i'm not sure it should be 90G but i'm not a guru on those bits.
[16:08:38] <joannac> it's 5% of your disk space iirc
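(For reference, the current oplog size and the time window it covers can be checked from the shell, and the oplog can be sized explicitly with mongod's --oplogSize option if the default, a percentage of free disk, is too large:)

    // show the configured oplog size and the time range it currently spans
    db.printReplicationInfo()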
[17:29:11] <bdiu> Sorry to repeat my earlier query, but... anyone have any interest in a few hours of paid consulting to help with some architectural/query/map-reduce goals? If not, anyone have any recommendations for other individuals or companies that do this? (Not 10gen, as their rate is way too high for me: $450/hr... ouch!)
[18:15:12] <cheeser> figured it was either port in use or file permissions :)
[20:01:03] <darkk^> I see quite an interesting behavior: mongodb can run a long read (getmore) query for minutes even if the client is already disconnected and the socket is in CLOSE_WAIT state. I see the issue on the 2.0.x branch, but I've not found any relevant bugs in jira. Has anyone seen behavior like that?
[20:05:00] <paulovap> is there a way to make text search ignore punctuation like "é" and recognize it as "e"?
[20:11:41] <kali> "é" is not punctuation, it's diacritics
[20:14:49] <BlakeRG> Hello, been banging my head against the wall on this for about an hour now, i just need to remove a single value from an array in a document (PHP) https://gist.github.com/BlakeGardner/615f9ff20959d4ab969a
[20:29:16] <BlakeRG> i knew it had to be something simple
[20:35:52] <kali> BlakeRG: avoid using variable stuff as a document key... prefer { name: "About", values: ["", "" ] } or you'll regret it sooner or later
[20:40:36] <treaves> When I use appendBinData() to create an entry on a BSONObj, what is the correct way to retrieve that data back off of the BSONValue?
[20:41:20] <BlakeRG> kali: will take that into consideration, i am just writing some scripts to remove data from an existing MongoDB that i didn't design
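(For the record, removing a single value from an array is what $pull does; a sketch against the schema kali suggests, with a hypothetical collection and value:)

    // remove one value from the values array of the matching document
    db.pages.update(
        { name: "About" },
        { $pull: { values: "valueToRemove" } }
    )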
[20:41:47] <treaves> value() returns five bytes too many.
[20:52:10] <generic_nick> i have a process that does some backing up of old data to long term storage
[20:52:21] <generic_nick> i am continually seeing this as it fails and restarts: shard global version for collection is higher than trying to set to
[20:52:36] <generic_nick> i tried restarting mongos and running flushRouterConfig
[20:53:00] <generic_nick> there are multiple processes using that mongos, but only the one reading that specific collection is having issues
[21:43:45] <hugod> I'm trying to set up MMS, but it keeps picking up an unqualified hostname, which doesn't resolve. The replica set is configured using IP addresses. Can we configure MMS to force it to use ip's?
[21:45:29] <hugod> (I have the MMS agent running on the same node as the arbiter, and seed it with `localhost`)
[22:46:00] <proteneer> my document looks like { "foo": { "someList": { "array_1": [1234] } } }, and i want to do an $push on the "array_1"
[22:46:04] <proteneer> what should my update look like?
[23:50:26] <proteneer> Joeskyyy, so find_one returns the entire document, is there a way to query for only a field within the document? so in the above case, I only want to display the 'array_1' list
[23:54:47] <proteneer> i just had to pass in -fields
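(Both halves of that exchange, expressed in the mongo shell; the collection name and _id filter are hypothetical:)

    // push onto the nested array using dot notation
    db.mycollection.update(
        { _id: 1 },
        { $push: { "foo.someList.array_1": 5678 } }
    )

    // fetch only that field (plus _id) instead of the whole document
    db.mycollection.findOne(
        { _id: 1 },
        { "foo.someList.array_1": 1 }
    )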
[23:54:58] <justen_> What's the best way to handle saving changes to documents with populated fields? Right now I'm going through each field for each schema and changing the populated fields to _ids before saving, but it doesn't really feel like the right solution.
[23:55:53] <cheeser> well it certainly made no sense to me.