[00:12:59] <Derick> and this abuser was not a simple troll
[00:20:47] <cheeser> joannac: well, it's hard to say how many tried to join and just gave up. but on the other hand, not having to deal with those who are so unwilling to expend *any* effort at all on their own behalf is its own reward. :)
[00:28:09] <joannac> cheeser: sure :) I'm just going to test this out quickly
[00:29:54] <joannac> lol, mibbit doesn't support freenode
[00:36:49] <joannac> 1. Adium sucks. 2. being unregistered means you can't join or see the topic, so it relies on the user being knowledgeable enough to know what "identify with services" means
[00:37:17] <joannac> I dunno if that's too high a bar - thoughts cheeser, derick, ehershey etc?
[00:37:28] <joannac> maybe we should take this to another channel :)
[00:54:54] <GothAlice> joannac: Hmm, it was my impression that one could join, but could not speak unless registered, allowing both the chanserv welcome PM and topic to be shown to the user.
[00:55:33] <GothAlice> (However this may have changed in the backlog I missed. ;)
[00:55:52] <GothAlice> Ah. I see what's going on there.
[00:57:04] <GothAlice> When you set up "registered only" channels, you can specify a dumping ground to deliver unregistered users to. Try joining #python from an unidentified web client, and you'll see what I mean. (You get dumped in ##python-unregistered.) Similarly, right now if you try joining #python-unregistered, you will be directed to #python if you are identified.
[01:01:56] <GothAlice> It's one of the least friendly things to do, though; it's a pretty high and confusing barrier to entry for users for whom IRC is a new tool. S'why ##python-friendly doesn't have that requirement. (I just accept the need for closer monitoring and manual moderation. Being a small channel helps.)
[01:55:34] <cheeser> hasn't really been a problem for ##java
[01:55:50] <cheeser> we get the occasional question but that's it
[01:57:03] <latestbot> Is it possible to do two delete operations in Mongo?
[01:57:36] <latestbot> delete a document from the original table then delete references
[01:59:14] <latestbot> so I have to do that separately?
[01:59:46] <latestbot> I know mongo is not a relational db, so was just wondering
[02:00:37] <cheeser> yes. you have to issue two delete commands
[02:06:08] <GothAlice> latestbot: Due to the race condition that develops because of the split operations, it's important to remember to armour the code that uses the reference against the reference not resolving to a document. (I.e. ignore invalid references when encountered.)
[02:07:31] <GothAlice> The ODM I use, sadly, doesn't do a very graceful job on this task.
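A minimal mongo-shell sketch of the two-step delete and the defensive read described above, using hypothetical "posts" and "comments" collections where each comment stores a postId reference:
    db.posts.remove({_id: postId});        // first delete: the referenced document
    db.comments.remove({postId: postId});  // second delete: everything referencing it
    // Between the two deletes another client can still read a comment whose postId
    // no longer resolves, so treat the missed lookup as "skip", not as an error:
    var post = db.posts.findOne({_id: comment.postId});
    if (post !== null) { /* safe to render the comment with its post */ }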
[02:12:56] <afroradiohead> database theory-wise, should there only be 1 unique index in a collection/table?
[05:35:01] <GothAlice> In the case of a queue, it sounds like something that a "capped collection" would suit.
[05:35:25] <GothAlice> Ref: presentation slides and linked code: https://gist.github.com/amcgregor/4207375
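A minimal sketch of the capped-collection queue idea, with hypothetical names; a tailable cursor follows insertion order and waits for new documents:
    db.createCollection("queue", {capped: true, size: 10 * 1024 * 1024});  // fixed 10 MB buffer
    db.queue.insert({task: "resize-image", created: new Date()});
    var cur = db.queue.find().addOption(DBQuery.Option.tailable)
                             .addOption(DBQuery.Option.awaitData);
    while (cur.hasNext()) { printjson(cur.next()); }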
[10:33:43] <mazzy> hi. is having a replica set in production highly recommended? I mean, can I put mongodb in production without a replica set and still guarantee good availability of the data?
[10:34:18] <mazzy> in other words, how many people run a production mongodb infrastructure without replica sets enabled?
[10:39:22] <kali> mazzy: yes, it is very highly recommended to go replica set for production
[10:40:46] <mazzy> in my case I have few resources available in order to deploy a replica set
[10:41:25] <mazzy> I don't have three separated machines but just two
[10:43:12] <mazzy> I'm aware that in order to deploy a replica set I need three different machines where the primary, the secondary and, in addition, an arbiter could run
[11:02:03] <cheeser> you can run the arbiter on the same machine as one of them...
[11:02:28] <cheeser> still, if you're going to production you should spring for an extra machine.
[11:02:45] <cheeser> how are you handling backups? e.g., mms requires you to run a replset for backup.
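A sketch of what cheeser describes, with hypothetical hostnames: two data-bearing members plus an arbiter colocated with one of them on a second port:
    rs.initiate({
      _id: "rs0",
      members: [
        {_id: 0, host: "db1.example.com:27017"},
        {_id: 1, host: "db2.example.com:27017"},
        {_id: 2, host: "db1.example.com:27018", arbiterOnly: true}  // votes, holds no data
      ]
    });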
[13:10:41] <GothAlice> polydaic: People don't become "familiar" with mongoose; they "survive" it. ;P What's your reference issue?
[13:11:12] <StephenLynx> polydaic my knowledge of mongoose is "stay the hell away from it"
[13:13:06] <GothAlice> polydaic: Notably, the 99% most common "reference" issue is Mongoose encouraging users to store ObjectIds as strings. Which means you're more than doubling the required storage space (ObjectId = 12 bytes, the string version = hex encoded, or 24 bytes + 4 byte length + null terminator), and if you accidentally mix ObjectId and string on the same field, bam, you can't query on that field any more.
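An illustration of the mixed-type trap, with a hypothetical "things" collection; once both representations exist, neither query form sees the other's documents:
    db.things.insert({owner: ObjectId("551137c2f9e1fac808a5f572")});   // 12-byte ObjectId
    db.things.insert({owner: "551137c2f9e1fac808a5f572"});             // hex string copy
    db.things.find({owner: ObjectId("551137c2f9e1fac808a5f572")}).count();  // 1, not 2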
[13:18:11] <StephenLynx> the funny part is that mongoose is developed by mongo inc.
[13:18:20] <StephenLynx> you would expect it to not be complete cancer.
[13:26:41] <StephenLynx> then I document how the model should be.
[13:27:00] <StephenLynx> you said that was a doubled effort, but since the only reference is the documentation, it's not doubled.
[13:30:32] <StephenLynx> btw, GothAlice, do you know a good option for crude hosting for databases?
[13:30:38] <[diecast]> in the createIndex examples I see some fields are quoted and others are not, is there a list of characters that need to be quoted and characters which are not allowed?
[13:30:46] <StephenLynx> I mean, I won't get much RAM and CPU, but lots of space
[13:31:52] <StephenLynx> linode will only give 24gb on the lowest plan
[13:32:19] <StephenLynx> 1tb would cost at least 500 USD / month
[13:32:31] <GothAlice> Cheapest option, truly? Buy or scrounge out of a dumpster a second-hand rack server and colo.
[13:32:43] <GothAlice> "Cloud" is, in general, a way to divorce organizations of their money.
[13:33:08] <GothAlice> For example: my at-home dataset would cost me half a million dollars a month to host on compose.io.
[13:33:14] <StephenLynx> self-hosting is not an option for me.
[13:33:29] <[diecast]> ok, found the doc for naming restrictions - http://docs.mongodb.org/manual/reference/limits/#naming-restrictions
[13:33:30] <StephenLynx> I need it to be elsewhere.
[13:33:55] <GothAlice> StephenLynx: "colo" — "Co-Located Hosting", i.e. they provide the rack, you provide the box.
[13:34:43] <StephenLynx> I had to launder my clothes in a river once :v
[13:35:03] <GothAlice> I live in Montréal, Canada (eastern), and have colocated boxes in Vancouver (almost as far west as you can go) and Dallas, Texas. ;P
[13:35:29] <GothAlice> I literally FedEx'd machines to the data centres.
[13:35:45] <cheeser> i'll be in montreal next month!
[13:36:05] <cheeser> taking the family up during the kids' spring breaks
[13:36:17] <StephenLynx> yeah, I would die in taxes and dollar conversion.
[13:36:36] <StephenLynx> 1 USD is over 3 BRL and I make about 3k BRL per month
[13:36:41] <[diecast]> in that doc it says field names can't contain dots, but the following example works - db.serviceMetadata.createIndex({'serviceData.seriesid': 1},{background : true, sparse : true})
[13:36:43] <GothAlice> StephenLynx: I'm not suggesting that you _must_ host in another country or use another currency…
[13:36:57] <GothAlice> [diecast]: That's because a "dot" means "field in an embedded document".
[13:37:01] <StephenLynx> no but I would have to import the hardware
[13:37:10] <StephenLynx> we don't produce hardware here.
[13:38:45] <GothAlice> Basically, go cloud if you have the money and the level of trust needed. Do _anything but "cloud"/VPS_ if you want actual guarantees. ;)
[13:38:59] <[diecast]> GothAlice: is this the appropriate way to define it? db.serviceMetadata.createIndex({serviceData: {seriesid: 1},{background: true, sparse: true})
[13:39:15] <GothAlice> [diecast]: No. Follow what the documentation provides. The query syntax uses dots.
[13:39:17] <cheeser> [diecast]: no, you just use the dotted name directly
[13:39:25] <StephenLynx> I plan on hosting a linode server for the back-end in Texas, so having the database close to it would be better
[13:39:38] <cheeser> i just moved off linode actually.
[13:39:47] <StephenLynx> any particular reason? moved to where?
[13:39:47] <cheeser> was too expensive for a mere irc bot :)
[13:39:55] <cheeser> i loved them, though. rock solid.
[13:39:57] <[diecast]> so my previous paste is correct?
[13:40:00] <StephenLynx> I plan on running a forumhub
[13:40:06] <StephenLynx> but using them for storage is way too expensive
[13:40:29] <[diecast]> then i was confused by why GothAlice put "{serviceData: {seriesid: 1, …}}"
[13:40:36] <GothAlice> StephenLynx: I use http://www.fortrustdatacenter.com for my USA dedicated hosts.
[13:41:27] <[diecast]> as an example for field in embedded document, which is something i'm not doing (i think)
[13:41:45] <[diecast]> i'm just starting to learn so excuse me for not being able to keep up ;)
[13:41:59] <GothAlice> [diecast]: That's the underlying structure. The key is: you're querying it, so you need the query syntax. This is also why "." isn't allowed in a real field name, a la: {serviceData: {'series.id': 1}} — that's unintelligible to the query builder. ('serviceData.series.id' becomes ambiguous.)
[13:43:02] <[diecast]> ok so to my original paste, is that not a proper way to build the structure? should i try to break out any "." index names
[13:44:02] <GothAlice> db.serviceMetadata.createIndex({'serviceData.seriesid': 1},{background : true, sparse : true}) — your original query (I believe) is perfectly valid.
[13:44:33] <GothAlice> Because there isn't actually a document with a field named "serviceData.seriesid", but there is an embedded document (or list of embedded documents) named "serviceData" with a field named "seriesid".
[13:45:11] <[diecast]> ok, i understand it is valid. but is it a good practice or should i consider breaking out the "." during the createIndex()
[13:46:10] <[diecast]> ok, i'll learn about that more. thank you.
[13:46:32] <GothAlice> See also: http://docs.mongodb.org/manual/core/document/#dot-notation
[13:46:37] <StephenLynx> diecast, if you are not familiar with JSON, learning it would help you greatly with mongo.
[13:47:33] <GothAlice> [diecast]: And again, for clarity, the difference here is between _defining_ data (which uses the {foo: {bar: …}} nesting approach, and _accessing_ it (which uses dot notation).
[13:47:54] <[diecast]> ok, i did get that from your texts
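A consolidated sketch of that distinction, reusing the collection from the discussion: nesting defines the structure, dot notation accesses it:
    db.serviceMetadata.insert({serviceData: {seriesid: 42}});          // defining: nested document
    db.serviceMetadata.createIndex({"serviceData.seriesid": 1},
                                   {background: true, sparse: true});  // accessing: dot notation
    db.serviceMetadata.find({"serviceData.seriesid": 42});             // queries use dots, too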
[14:02:12] <StephenLynx> hey, does anyone know these guys? http://www.online.net/en/dedicated-server/dedibox-scg2
[14:02:46] <GothAlice> … the picture of the server has the caption: * contractual picture
[14:02:51] <GothAlice> WTF does that even mean? XD
[14:03:40] <GothAlice> For your needs, i.e. lots-o-space, that box looks sufficient. It's single-core, so, that's a tad lame, but it certainly is cheap.
[14:04:28] <GothAlice> Also not sure what "victim of its success" means as an "availability" status. Does that mean they're out of them? … but it's green. Very confusing.
[14:04:55] <medmr> victim of its success probably means
[14:05:03] <medmr> there is a good chance of delays
[14:35:44] <greyTEO> has anyone encountered a problem when updating to wiredTiger on 3.0? I cannot seem to create databases anymore.
[14:36:20] <NoOutlet> To be fair, I have not looked into configuring it to prioritize identifying.
[14:38:11] <Derick> cheeser: not sure whether I like the +r...
[14:38:11] <StephenLynx> ooh, so that is what happened.
[14:38:30] <StephenLynx> joannac can you change it back? not being able to auto-join is annoying
[14:38:39] <NoOutlet> Looks like there is a fix for freenode! http://askubuntu.com/questions/6332/prevent-xchat-from-trying-to-join-channels-until-i-have-been-authenticated
[15:04:39] <GothAlice> Robomongo isn't much better than an interactive shell, anyway, and IMHO hides information from you in a very inconvenient way (collapsed tree entries shown as "{…}"; basically requiring slow mouse interaction) by default.
[15:05:22] <GothAlice> It's an anti-tool that adds to the effort needed, it doesn't reduce it. ;)
[15:09:19] <greyTEO> I use it mainly for quick glances. Shell can be a pain with a limited viewport. I do agree that it can slow things down at times though
[15:46:03] <Mxx> can any1 help me with adding an existing server to mms?
[15:46:30] <Mxx> I get an error "Unexpected Error: Another session or user has already published changes"
[15:50:19] <cheeser> do you have multiple tabs open?
[15:55:57] <Mxx> i did not when I got that message
[15:56:18] <Mxx> but when I opened 2nd tab I saw there were 2 pending tasks to install monitoring and backup agents
[16:02:30] <Mxx> hmm now I added monitoring agent but it's stuck on "Verifying Host..."
[16:04:07] <Stiffler> I have a document. It already exists and contains a couple of fields. Now I would like to add a new array field and push values into it
[16:19:06] <GothAlice> I tried to add a role to one user and accidentally reset their password.
[16:19:54] <GothAlice> The power of autocomplete isn't with me today. Ugh. ^_^ *real, not "repeal"*
[16:41:29] <GothAlice> Wewt! My analytics reprocessing benchmark didn't bring MognoDB to its knees this time. \o/ http://s.webcore.io/image/1q023q1a0B2g — 4 processes, ~1K queries/second, ~250 updates/second, 3MB/s max throughput, zero page faults, zero index misses. So happy. :3
[16:41:59] <GothAlice> s/MognoDB/MongoDB/ I need to adjust this standing desk. :/
[16:43:06] <StephenLynx> what hardware did you run it on?
[16:44:48] <GothAlice> StephenLynx: App server running the benchmark/migration is a Rackspace 4GB Performance VM, the DB hosts are 3 1GB General Purpose v1 VMs. Testing a lower memory allocation was also part of it, and why MongoDB exploded for me whenever I tried out WiredTiger. ;)
[16:46:18] <GothAlice> So, this reprocessed a total of 178,517 documents in ~4 minutes.
[17:04:06] <GothAlice> rendar: The bulk of those documents are full records of request/response cycles. We mine this general purpose "activity" data for explicit click actions, then pre-aggregate the result into hourly segments. Normally this is done live, i.e. a new activity record comes in, it's also pre-aggregated into the hourly segment, but all of that data can be rebuilt from the source material.
[17:04:10] <GothAlice> We have a migration that does this if it finds the pre-aggregated data to be missing. Makes for a great benchmark. :)
[17:07:10] <ttaranto> hi! one tip, you could put all mongodb videos on your youtube channel, so we can watch them on a smart TV. I can't do it with this video: http://www.mongodb.com/presentations/webinar-fast-querying-indexing-strategies-optimize-performance
[17:09:17] <cheeser> ttaranto: you should send an email via that "contact us" link. get that request in the pipeline.
[17:09:18] <GothAlice> ttaranto: More things need to support HTML5 video. If this is helpful at all, http://cdnbakmi.kaltura.com/p/1067742/sp/106774200/serveFlavor/entryId/1_hkq7osow/v/1/flavorId/1_s8lv3c19/forceproxy/true/name/a.mp4 is the "video source" for that one.
[17:09:34] <cheeser> the people that deal with that stuff don't hang out here.
[17:09:42] <GothAlice> But yeah, YouTube is certainly "highly accessible" across devices.
[17:12:48] <GothAlice> latestbot: Could you fix your internets, please? ;) Your connection has been up and down like nobody's business.
[17:19:51] <latestbot> I will surely look into it again
[17:26:20] <GothAlice> latestbot: https://gist.github.com/amcgregor/d6518eef1eb2071d12ab is my (redacted) znc.conf, but it includes a "znc --makeconf" tool, too. (If you use mine as a template, search for NOPE and use the "znc --makepass" command to generate new hashes.)
[17:26:29] <GothAlice> For whenever you get a chance to look into it. :)
[17:26:57] <latestbot> That’s so nice of you! thanks GothAlice!
[17:49:45] <cheeser> GothAlice: i've been debating writing a java lib to do that...
[17:50:30] <GothAlice> … unless one has extremely specific requirements and sporadic interactions with the oplog it's universally better to stream the bits you care about the same way MongoDB does, using a tailing cursor. (You can still filter to only the operations you care about, of course, but you get "live" notifications of newly added operations and it never needs to search through older data.)
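A sketch of that tailing approach from the mongo shell; the namespace filter is hypothetical:
    var oplog = db.getSiblingDB("local").oplog.rs;
    var cur = oplog.find({ns: "mydb.mycoll", op: {$in: ["i", "u", "d"]}})
                   .addOption(DBQuery.Option.tailable)
                   .addOption(DBQuery.Option.awaitData);
    while (cur.hasNext()) { printjson(cur.next()); }  // "live" inserts/updates/deletes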
[17:50:48] <GothAlice> cheeser: Don't do it, man! Java, woah. XP
[17:51:16] <cheeser> i've spent the last 20 years with it. not likely to stop now. especially for *javascript*
[17:51:44] <GothAlice> The type dynamism of MongoDB and the… exact opposite of that in Java always made me scratch my head. Doesn't seem like a good fit, or at least, if one is to make it fit there'll be a fair amount of boilerplate.
[17:52:12] <unholycrab> GothAlice: this is pretty cool. im looking for something that happened a few hours ago, though
[17:52:19] <cheeser> it's a great fit really. our most popular application platform is java.
[18:53:44] <GothAlice> For example, /var/lib/mongodb. Only the _contents_ of the folder get recreated. Also, check permissions. Make sure whatever mongod wants to run as has permissions there. Also, check the logs. MongoDB is pretty good about giving error messages in the event of a problem.
[19:23:42] <GothAlice> abishek: See also: http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/
[19:24:34] <phutchins> Hey, if I point an app at the master in a replica set, will it still be able to read from slaves? or only from that master?
[19:25:35] <cheeser> phutchins: the drivers take that server (list) you give as a seed. they'll discover the topology and talk to the primary unless you use a read preference to force (or simply allow) a secondary read.
[19:25:53] <GothAlice> phutchins: Each server knows of the others, which is pretty useful.
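A sketch of that discovery behaviour in the mongo shell (set name and hostnames hypothetical); one seed is enough, and read preference decides whether secondaries serve reads:
    var conn = new Mongo("rs0/db1.example.com:27017");   // seed; the rest of the set is discovered
    var mydb = conn.getDB("mydb");
    conn.setReadPref("secondaryPreferred");              // allow, not force, secondary reads
    mydb.orders.find().readPref("secondary");            // or choose per-query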
[19:25:59] <phutchins> Basically i'm having a problem where doing an rs.remove on a node in our replica set crashed our app until it was restarted. I'm fairly certain that this was an issue with the connectionTimeout setting on the driver in nodejs which I've fixed but want to be cautious when testing the fix...
[19:26:17] <phutchins> cheeser: yep, so i should be able to give it a single master and it will still discover
[19:26:49] <phutchins> GothAlice: yeah definitely. I'm trying to point one of our workers directly at the master to avoid the issue i mentioned again if it happens...
[19:27:03] <cheeser> yes. though you should use a larger set as a ward against that machine being unreachable for whatever reason.
[19:27:13] <phutchins> GothAlice: but I think that the same thing may result since it still is just a seed node and gets the config
[19:27:37] <phutchins> cheeser: right, this would just be temporary while testing the fix. Not sure if there is any benefit tho
[19:28:10] <phutchins> so if i pointed at the master, and not the config server, it would still need to reconnect as the rs has changed, correct?
[19:31:03] <cheeser> why would you point it at the config server?
[19:32:16] <phutchins> cheeser: well it is that way now. The last time we ran rs.remove, we had to restart all of our workers to get the app going again. The mongo driver in nodejs shouldn't have a problem handling this however. I updated the config to increase the connection timeout which should solve it but i'm trying to think up a fallback so that we can try the change again (rs.remove) but if it does the same thing as last
[19:32:22] <phutchins> time, wondering how I can make one of the workers not fail
[19:33:07] <phutchins> cheeser: maybe i meant mongos
[19:34:37] <girb1> is there a way in the mongos router where I can say all read queries should go to secondaries in a sharded cluster?
[19:35:01] <girb1> I know we can mention it in mongo client
[19:35:13] <phutchins> cheeser: i attempted to duplicate the issue but was unable to in our staging environment which is why I think it has something to do either with load on the mongos or the cluster itself such that it slows down reconfigure time
[19:36:28] <phutchins> cheeser: so the way i understand it, there probably is no good way to work around this with a single worker. If it's going to happen again, it's going to happen...
[19:43:34] <d4rklit3> assertion: 13 not authorized on admin to execute command when trying to run mongorestore
[19:43:41] <Bookwormser> Can someone explain what is wrong with this query? I am trying to count all between 2 unix timestamp ranges, but I get no results. There are records with these timestamps though: db.mycollection.count({'m_time' : {'$gte' : '1426550400', '$lt' : '1426636800'}})
[19:47:40] <GothAlice> Bookwormser: Double check your actual stored data. Make sure those are being stored as numbers (4 bytes) instead of strings (4+10+1=15 bytes), then update your query to query them as numbers.
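A quick way to check, following that advice (collection and field names from the question):
    printjson(db.mycollection.findOne({}, {m_time: 1}));   // is m_time 1426550400 or "1426550400"?
    db.mycollection.count({m_time: {$gte: 1426550400, $lt: 1426636800}});  // numbers, not strings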
[19:51:11] <FlynnTheAvatar> Hi, are there already Centos7 RPMs for mongo 2.6.9?
[19:51:46] <GothAlice> FlynnTheAvatar: Have you tried: http://docs.mongodb.org/v2.6/tutorial/install-mongodb-on-red-hat/
[19:52:45] <FlynnTheAvatar> GothAlice: Yes, I tried it. But it seems 2.6.9 rpms are missing at http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/RPMS/
[19:53:42] <GothAlice> Well then. :D FlynnTheAvatar: Could you open a ticket on http://jira.mongodb.org for this?
[19:57:11] <GothAlice> FlynnTheAvatar: http://downloads.mongodb.org/linux/mongodb-linux-x86_64-rhel70-2.6.9.tgz (via http://www.mongodb.org/dl/linux/x86_64) may be sufficient to get you going.
[19:57:30] <FlynnTheAvatar> GothAlice: Sure. In project Core Server?
[19:59:14] <FlynnTheAvatar> GothAlice: We have working Dockerfiles that install MongoDB with yum...
[20:03:11] <vipul20> there's a log file in mongodb but no config file found in /etc/init.d and checking on port 27017 displays no result
[20:03:39] <FlynnTheAvatar> GothAlice: We created the issue (SERVER-17738)
[20:03:45] <GothAlice> vipul20: Log files don't go in the initialization script folder. /etc/mongod.conf exists?
[20:03:50] <vipul20> but I am running a project that can perform operations in mongodb
[20:04:29] <vipul20> GothAlice: no, /etc/mongod.conf doesn't exist
[20:05:06] <GothAlice> vipul20: sudo lsof | grep mongod | grep LISTEN — this will tell you what IP and port mongod is listening on; if nothing's there, mongod might not be running: ps aux | grep mongod
[20:05:52] <GothAlice> vipul20: Finally, to see what the command line of the running mongod process was, run: sudo cat /proc/`pidof mongod`/cmdline
[20:06:05] <GothAlice> (MongoDB doesn't use a configuration file unless explicitly told to on the command line.)
[20:07:46] <GothAlice> (The command line might look a bit garbled, like there are no spaces between options, when printed out this way. That's OK and normal.)
[20:08:11] <vipul20> GothAlice: ps aux is giving me the pid
[20:08:35] <vipul20> GothAlice: but the lsof command is not giving me any information
[20:08:39] <GothAlice> The "ps" + grep command here should emit two lines.
[20:11:48] <GothAlice> FlynnTheAvatar: Also, thanks. I'm not sure if anyone with the ability to correct the missing RPM camps in this channel or not, so a JIRA ticket is the way to go to make sure it sees some action. :)
[20:13:06] <cheeser> GothAlice: reasonably sure that answer's no
[20:15:10] <FlynnTheAvatar> GothAlice: Yeah, sure. I just wanted to check if somebody was already working on it.
[20:15:18] <GothAlice> vipul20: Some notes: a) it's binding to 127.0.0.1 only, meaning only local access is allowed. (Generally a good thing. ;) b) It's trying to use /etc/mongodb.conf for configuration. c) It looks like you have some stuck data, there. Did the server crash or was it otherwise abruptly restarted in the recent past?
[20:16:11] <GothAlice> vipul20: Specifically, it's reporting damage to files in /var/lib/mongodb/ (the journal). You may need to manually run mongod (with the same options as the init.d script tries to), but with the addition of --repair.
[20:16:32] <GothAlice> (To repair, though, you'll want 2.1x as much free space as the size of your data.)
[20:19:05] <vipul20> GothAlice: please can you tell me the exact command?
[20:19:58] <GothAlice> First, does /etc/mongodb.conf exist? ('db', not just 'd' at the end there)
[20:25:09] <vipul20> it is giving me this output "Thu Mar 26 01:54:10.689 Can't specify both --journal and --repair options"
[20:25:10] <GothAlice> (Luckily you had no options set in the config that I would have had to disable.) Didn't technically need to include the dbpath a second time, but I had already written it out just in case. ;)
[20:25:51] <vipul20> GothAlice: what should i do next?
[20:26:37] <GothAlice> You're running a rather ancient version of MongoDB, it seems. 2.4? In this instance, we can't repair the journal. Instead, the only option left is to remove the dead journal files and let MongoDB recreate it, losing whatever may have been stuck there.
[20:28:37] <GothAlice> Step one there removes the bad journal. Step two attempts a standard repair (without configuration and journalling option)—if your journal was corrupt, there may be other issues, though hopefully not. This effectively exports and re-imports your data in a safe way. Then, you can try to start up the server as normal.
[20:31:34] <vipul20> GothAlice: thanks that worked :D
[20:31:53] <GothAlice> Heh; triple check that your data looks OK.
[20:32:13] <GothAlice> Corrupt journal = bad news bears, and can be a sign of broader issues like disk failure.
[20:36:24] <FlynnTheAvatar> GothAlice: That was quick, the RPMs are now available
[20:44:21] <FlynnTheAvatar> well, personally I thought it was a slight oversight and not important enough for commercial support. the RPMs are the community version anyways
[21:13:54] <d4rklit3> i have no access to this database
[21:15:19] <GothAlice> d4rklit3: MongoDB users are per-database. If your admin user is registered against the admin database, you'll need to remember to set your --authenticationDatabase when using command-line tools against other DBs.
[21:15:34] <d4rklit3> is there a link on how to do this
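A sketch of the same idea from the mongo shell (credentials hypothetical); the command-line tools take the equivalent --authenticationDatabase admin flag:
    var conn = new Mongo("localhost:27017");
    conn.getDB("admin").auth("admin", "secret");   // the user lives in the "admin" database...
    var appdb = conn.getDB("cf-develop");          // ...but can work against other databases
    appdb.stats();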
[21:23:05] <wutze> Hi there, does the new mongodb-java-api-3.0 have getDocument() and getArray() functions? With the old driver there was only get() and I always had to cast my values
[21:23:10] <GothAlice> {field: {operator: value}} — "$eq" is an operator.
[21:23:26] <nemothekid> yeah I have the equal, the count is the same as if I don't have the eq
[21:23:59] <GothAlice> nemothekid: Until you gist/pastebin the actual query you're running… {"eq" null} is what you wrote, and it's wrong.
[21:24:52] <GothAlice> nemothekid: Hmm, looks like it's still running into the BSON comparison order. :/
[21:25:12] <GothAlice> $type semi-works, but arrays interact somewhat strangely with it, too.
[21:26:07] <GothAlice> Further confirmation that mixing types in a single field name = bad mojo. You may have to manually sift through the records returned, rejecting ones for which that field is an array application-side. (Pro tip: fix the data.)
[21:26:30] <nemothekid> yeah I'm trying - a bug created these null fields
[21:27:12] <nemothekid> $type: 10 (null) returns 0, but type 4 (array) returns everything
[21:27:26] <nemothekid> manual might be the only option
[21:28:18] <d4rklit3> GothAlice, ok so root access... Error creating index cf-develop.system.users: 13 err: "not authorized to create index on cf-develop.system.users"
[21:28:24] <GothAlice> You can complicate the query a bit… {assets: {$eq: null, $not: {$elemMatch: {$eq: null}}}}
[21:28:49] <GothAlice> nemothekid: This will select documents whose "assets" field is null, or an array containing null, then exclude the ones that have array elements that are null. ^_^
[21:36:17] <GothAlice> The amount of success you have, doing anything, will be highly variable because of that. "cf-develop" isn't a valid symbol name in any sane language (being interpreted as "cf minus develop"). One should keep their database and collection and field names within the realm of the following regular expression: [_a-zA-Z][_a-zA-Z0-9]*
[21:39:53] <GothAlice> However, you should use your admin user to create a database-local user with sufficient permissions to backup/restore and use the DB. I.e. readWrite@<dbname> and/or dbAdmin@<dbname>
[21:40:05] <GothAlice> Then you can also avoid the --authenticationDatabase thing.
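A sketch of such a database-local user (username and password hypothetical):
    var appdb = db.getSiblingDB("cf-develop");
    appdb.createUser({
      user: "appuser",
      pwd: "secret",
      roles: [{role: "readWrite", db: "cf-develop"},
              {role: "dbAdmin", db: "cf-develop"}]
    });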
[21:54:31] <GothAlice> http://docs.mongodb.org/manual/release-notes/3.0-downgrade/#downgrade-path < yeah, 'cause auth can't be downgraded. Okay, if you nuked it, I'm not sure what the issue is.
[22:01:59] <GothAlice> Hmm, now that I think about GUIs, there *is* an ncurses interface I use: bpython. (I use ipython most of the time, though.) "screen" + "bpython" + "pymongo" == robomongo in a terminal.
[22:02:00] <d4rklit3> lol if i urlencode auth fails!
[22:13:18] <GothAlice> Well, anyone who Code Commanders for 4 days straight is in for a painful following day after filtering 4 days of accrued toxins.
[22:20:17] <GothAlice> "Core Dump" – to relieve oneself while ill or under duress; often explosively. See also: "captain's log". "Offline maintenance" – to become unconscious, generally on a regular schedule.
[22:37:58] <d4rklit3> GothAlice is there an online source for these?
[22:38:19] <GothAlice> Nah, these are coming from my brain.
[23:02:41] <aliasc> why you should never use mongodb
[23:03:53] <aliasc> is mongodb good for social networking website ?
[23:04:55] <GothAlice> aliasc: https://blog.serverdensity.com/does-everyone-hate-mongodb/ and it can be great for general data storage on a social-style site. For storage of the actual social connectedness graph, though, use a real graph database. (Or write a graph back-end for MongoDB 3. ;)
[23:05:44] <GothAlice> There's a tool called mongo-connector which can greatly aid in having "data" in MongoDB and "graph" elsewhere.
[23:05:55] <aliasc> the post says, if you link documents with ids you are going against basic concepts of mongodb
[23:06:03] <aliasc> and relationships in a social networking website are a must
[23:06:22] <GothAlice> To an extent, yes. MongoDB isn't relational, so you can't perform efficient joins.
[23:06:44] <GothAlice> However, even most popular FOSS SQL systems can't handle deep graph searches. Graph databases are their own bucket of bolts.
[23:07:26] <aliasc> we are talking about a simple blog, let's say
[23:07:43] <GothAlice> Blogs don't need graphs, so MongoDB makes for a great solution.
[23:08:36] <GothAlice> Put two different types of values in the same field name in a collection and watch your queries crumble.
[23:08:43] <aliasc> well i also feel like if you follow the rules of relationships you are doing it wrong in mongodb
[23:08:50] <GothAlice> So, ensuring there aren't conflicts is somewhat like a "minimum effort schema".
[23:09:14] <GothAlice> A graph database is needed to answer queries like: find me all friends up to four connections away from user X who also have user Y in common.
[23:10:49] <aliasc> im running software with mongodb in the backend that checks for new videos and channels in youtube cms
[23:11:01] <aliasc> and updates/attaches logos on thumbnails
[23:11:06] <GothAlice> However, the simple relational example of forums (forums::threads::replies, all one:many) turns into an almost-relational model in MongoDB: replies are embedded in threads, which reference forums.
[23:12:05] <GothAlice> http://www.javaworld.com/article/2088406/enterprise-java/how-to-screw-up-your-mongodb-schema-design.html points out the important task: identifying what a "document" is, in your model, is key.
[23:13:06] <GothAlice> A reply to a thread… doesn't really deserve to be its own top-level document. It's fully dependent on the thread, should get cleaned up with the thread, etc. And MongoDB atomic operations and array manipulation ($push, $elemMatch, etc.) allow you to embed it naturally.
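A sketch of that embedded model with hypothetical field names; deleting the thread cleans up its replies for free:
    db.threads.insert({forum: forumId, title: "...", replies: []});
    db.threads.update({_id: threadId},
                      {$push: {replies: {author: userId, body: "...", created: new Date()}}});
    db.threads.remove({_id: threadId});   // replies go with it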
[23:15:23] <aliasc> in a social website a user who liked the post may be embedded into the post
[23:15:30] <aliasc> that user might also be friend
[23:15:37] <aliasc> and that user might also be commenter
[23:16:10] <aliasc> embedding and duplicating data means moving through each one of them to modify on changes
[23:16:39] <GothAlice> https://github.com/bravecollective/forums/blob/develop/brave/forums/component/comment/controller.py#L110-L134 is my forum "voting" code (like giving it a thumb up)
[23:17:02] <GothAlice> Note, I'm not storing anything but the ID of the user who voted up, and incrementing a pre-aggregated sum.
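The same pattern as a shell sketch (field names hypothetical): one atomic update both records the voter and bumps the pre-aggregated count, and only fires if they haven't voted yet:
    db.comments.update({_id: commentId, "vote.who": {$ne: userId}},
                       {$inc: {"vote.count": 1}, $addToSet: {"vote.who": userId}});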
[23:19:00] <aliasc> im new to mongodb i admit it. and there are times where i feel like im not cooperating with the concepts of mongodb
[23:19:19] <aliasc> instead im fighting to balance schemaless and ER
[23:19:35] <GothAlice> It's not easy if you come from an existing database background. There is much to unlearn.
[23:19:50] <GothAlice> (Databases can be waaay cooler than a fancy spreadsheet.)
[23:20:06] <aliasc> true, i found mongodb useful for bulk operations, the software i wrote performs incredibly well
[23:23:13] <aliasc> Diaspora, a project on Kickstarter, was using mongodb; they felt betrayed
[23:23:18] <GothAlice> Heck, I'm a Python developer running her code in Java to abuse Java's excellent natural language processing and neural network libraries, storing most data in MongoDB and graphs in neo4J.
[23:24:06] <GothAlice> aliasc: Betrayed by what? Most big "MongoDB is crap" blog posts have no basis in reality and only serve to demonstrate that the author failed to read the documentation.
[23:24:36] <GothAlice> … or was clearly misusing the DB. I.e. to store graphs.
[23:24:43] <aliasc> true true, i dont always believe what a frustrated shitty developer wrote on his blog
[23:25:15] <GothAlice> You think your database is fast now…
[23:25:18] <aliasc> honestly i try to avoid posts that say mongodb sucks but you need to face some
[23:26:11] <aliasc> im not saying mongodb isnt a professional database, im trying to figure out if its the best choice for my project
[23:26:17] <GothAlice> (TokuMX is a fork of MongoDB… on LSD. It uses fractal trees for storage and change propagation, avoiding almost all forms of locking during operation.)
[23:27:08] <GothAlice> aliasc: Again, why make only one choice? (We use MongoDB for data, neo4J for graphs. If we _really_ needed transactional safety, say, for financial operations, I'd add postgres to the mix.)
[23:27:13] <GothAlice> There is a FOSS version, yes.
[23:27:59] <GothAlice> Oh, TokuMX also supports true transactions.
[23:28:59] <GothAlice> None of my databases use permanent storage.
[23:29:23] <aliasc> why have limits on how large documents can be?
[23:30:06] <GothAlice> I.e. the VM starts, tries to connect to the cluster. If it can't it pulls out of deep storage the latest snapshot and the oplogs between the snapshot and now, replays them, then spins up two more hosts automatically to use as replication secondaries. If an existing cluster is found, it pulls the latest snapshot, applies it, then joins the replica set and caches up from the live oplog.
[23:30:16] <GothAlice> aliasc: To limit the impact of worst-case scenarios.
[23:31:11] <aliasc> can you give me a link to how you would model your data for a blog/portal website in mongodb
[23:31:51] <GothAlice> Pro tip: if writing a "blog" site, don't do comments yourself. (Use one of the hosted comment tools like Disqus; comment moderation is a PITA to write.)
[23:32:46] <GothAlice> If you don't need to deal with comments, then, a blog is what, {_id: …, slug: "your-code-style-guide-is-crap-but-still-better-than-nothing", title: "…", lead: "…", body: "…", modified: …} and that's about it.
[23:34:08] <GothAlice> Copy and paste into Notepad, my friend. ;)
[23:34:49] <aliasc> i already modeled my data like this im trying to see if im doing it wrong :P
[23:34:51] <GothAlice> Yup. Since you might want to display the author name with the blog post (pretty typical) you might store this instead: {… author: {id: ObjectId(…), name: "Alice"}}
[23:35:07] <GothAlice> This will save an extra lookup for each post.
[23:35:27] <GothAlice> (If you ever change your name, just $set the new value across all matching posts.)
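A sketch of that denormalization and the rename fix-up (field names hypothetical):
    db.posts.insert({title: "...", author: {id: authorId, name: "Alice"}});
    db.posts.update({"author.id": authorId},
                    {$set: {"author.name": "Alice B."}},
                    {multi: true});   // one multi-update repairs every matching post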
[23:36:24] <aliasc> this is what i was trying to discuss, so you need to set the new value across all matching posts
[23:36:27] <GothAlice> Also, in many blogs a "category" is basically just a simple string tag. Store the string tags in the posts, not ObjectId references to a collection of these tags. :)
[23:36:48] <aliasc> also in many blogs categories can appear in the main menu
[23:37:14] <aliasc> you cant put a category in a post as a string if you mean to categorize posts
[23:37:42] <GothAlice> You can use an aggregate query to get the set of all unique categories from the posts themselves, or even pre-aggregate it. (I.e. store the string version in the post, but _also_ insert it into a collection of categories, with the string as the _id. Then, generating the menu is one simple query.)
[23:37:52] <GothAlice> aliasc: Yes you can. The string is the unique key. :D
[23:38:27] <aliasc> like this you are not modeling, you are fighting, opening new doors for problems
[23:38:36] <GothAlice> {title: "How to screw up your MongoDB schema design", category: ["MongoDB", "Java"]}
[23:39:06] <aliasc> give me all categories so i can put them in the header menu
[23:39:07] <GothAlice> Then there are no database lookups to show the list of categories on each post. :)
[23:40:12] <GothAlice> db.posts.aggregate([{$unwind: "$category"}, {$group: {_id: '$category'}}]) — all categories ever mentioned, but only included in the list once, each.
[23:40:47] <aliasc> hundreds of thousands of posts to query for 10 categories
[23:41:05] <GothAlice> I mentioned pre-aggregation of the categories above.
[23:41:31] <aliasc> bear with me, maybe i dont understand the full concept of mongodb. i said im new
[23:41:35] <GothAlice> When you add a post, attempt to insert each category into a "categories" collection. I.e. db.categories.insert({_id: "MongoDB"}) — if it blows up, it's already there.
[23:41:51] <GothAlice> (And yes, that's the sum total needed for a "category" document in this case.)
[23:42:19] <GothAlice> Then, getting the list of categories is: db.categories.find()
[23:43:06] <GothAlice> This isn't data duplication: it's query optimization. :D
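The whole category pattern in one place; save() upserts by _id, so re-adding an existing category is a no-op rather than the insert-and-ignore-the-error approach above:
    db.posts.insert({title: "...", category: ["MongoDB", "Java"]});
    ["MongoDB", "Java"].forEach(function (c) { db.categories.save({_id: c}); });
    db.categories.find();   // the entire menu query
    db.posts.aggregate([{$unwind: "$category"}, {$group: {_id: "$category"}}]);  // rebuild if ever needed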
[23:43:25] <aliasc> its hard to think this way when you come from other database background
[23:43:53] <aliasc> like, what the heck, is this a good or bad approach
[23:44:10] <GothAlice> It's an approach. "Good" and "bad" require measurement.
[23:46:51] <GothAlice> Straight up relational, with ObjectId references to categories, would cripple read performance by requiring you to perform an extra query (potentially several depending on how well written your loops are) just to look up the category names. Same if you embed _only_ the ObjectId of the author. That'd make displaying one posting = 3 queries (optimized). Showing a paginated listing of them? Yeah… 1 query to get the initial list, 2 extra
[23:46:51] <GothAlice> queries for every row you display, if you stream the rows out.
[23:47:46] <GothAlice> (You can optimize this to only three queries total, too, if you pull in all the data before emitting anything out, but that's sub-optimal from a responsiveness perspective.)
[23:49:16] <GothAlice> Often it's beneficial to show _something_ quickly, even if it takes longer to fully load _everything_. :)
[23:50:57] <GothAlice> I hope all of this has been somewhat helpful to your understanding, and not just adding to the confusion.