PMXBOT Log file Viewer


#mongodb logs for Friday the 13th of March, 2015

[00:00:13] <quan2m> Ok.. serious n00b question.
[00:01:06] <quan2m> I moved the db directory...
[00:01:13] <quan2m> and I can't get the permissions to work out.
[00:01:39] <quan2m> I've taken the horrible step of disabling SELinux, and the file permissions are correct.
[00:20:04] <tsyd> Is it bad practice to use the MongoDB _id in application logic?
[00:43:45] <daidoji> tsyd: depends on if you need idempotent keys or not
[00:44:42] <GothAlice> tsyd: Also, depends on if you consider the data in ObjectIds sensitive or not, in case they are shown to a user (i.e. in a URL).
[00:47:27] <Jonno_FTW> I'm using $where:"this.datetime.getDay()==4", but it just returns all records, not just those on the 4th day of the week
[01:07:13] <joannac> Jonno_FTW: demonstration?
[01:07:42] <joannac> Jonno_FTW: i.e. pastebin a document that gets returned that shouldn't
[01:15:01] <mdavid613> hey all, question for you. I'm trying to setup a new replicaSet with x509 user / member authentication. I have a primary setup with a replicaSet enabled, x509 and users setup etc…when I try to rs.add() the other host I'm getting back "exception: need most members up to reconfigure, not ok" from the Primary. From the new secondary I'm seeing this in the logs: Unauthorized not authorized on admin to execute command { replSetHeartbeat:
[01:15:26] <mdavid613> does anyone have any ideas what I may be doing wrong? Also, the new secondary member has a blanked data directory
[01:16:07] <joannac> mdavid613: erm, no subscription?
[01:16:39] <mdavid613> not sure what you mean by that, you mean subscription to 10gen for support?
[01:17:11] <joannac> right. i thought x509 was an enterprise only feature
[01:19:02] <mdavid613> it works for user authentication on a standalone
[01:19:07] <mdavid613> that I've already tested
[01:19:38] <mdavid613> and the x509 for cluster members documentation doesn't mention enterprise only support
[01:23:13] <joannac> mdavid613: huh, interesting
[01:23:17] <Jonno_FTW> joannac: http://pastebin.ws/88eu2g
[01:23:34] <joannac> mdavid613: your certs are probably wrong then
[01:24:32] <daidoji> when does Mongo write when inserting/updating files?
[01:24:37] <daidoji> or rather, when does count() get updated?
[01:24:49] <joannac> Jonno_FTW: http://www.w3resource.com/javascript/object-property-method/date-getDay.php
[01:24:55] <Jonno_FTW> joannac: I think I found the problem, getDay() is the UTC date, but my dates are time zone adjusted
[01:25:06] <Jonno_FTW> by 11 hours
[01:25:10] <daidoji> actually disregard, found my issue
[01:25:19] <joannac> Jonno_FTW: no, you misunderstood what that function does
[01:25:22] <joannac> Jonno_FTW: http://www.w3resource.com/javascript/object-property-method/date-getDate.php
[01:28:07] <mdavid613> joannac: the certs are correct, I copied them there myself…I'll try again
[01:28:09] <mdavid613> thanks
[01:29:12] <Jonno_FTW> joannac: what am I doing wrong then?
[01:29:28] <joannac> Jonno_FTW: did you read the 2 links I sent you?
[01:29:33] <Jonno_FTW> yes
[01:29:44] <Jonno_FTW> oh
[01:30:02] <mdavid613> joannac: yeah, I just copied again and still the same error sadly :(
[01:31:57] <Jonno_FTW> how come ISODate("2013-06-06T03:05:00.000Z").getDay() == ISODate("2013-06-05T14:30:00.000Z").getDay(), I don't follow
[01:32:12] <Jonno_FTW> the 6th and the 5th are different days
[01:32:50] <Jonno_FTW> or was it the same day according to my local time?
[01:32:52] <joannac> Jonno_FTW: huh
[01:32:57] <joannac> they're not the same for me
[01:33:19] <Jonno_FTW> so is it using my local time and treating the isodate as GMT?
[01:33:29] <joannac> Jonno_FTW: http://pastebin.com/8HtzEF5q
[01:36:12] <Jonno_FTW> joannac: http://i.imgur.com/wk0s3bj.png
[01:38:34] <joannac> Jonno_FTW: that's not the mongo shell
[01:39:02] <joannac> Jonno_FTW: if you're using something other than the mongo shell, I can't comment on what they do with ISODates
[01:39:15] <Jonno_FTW> joannac: it's just a gui connection to a mongo shell
[01:40:40] <joannac> well, i dunno what to tell you
[01:40:47] <joannac> in my mongo shell it works fine
[01:40:52] <joannac> what version?
[01:41:23] <Jonno_FTW> 2.4.6
[01:41:48] <joannac> can you do it in the actual mongo shell?
[01:42:36] <Jonno_FTW> yes, and I got the same result
[01:42:47] <joannac> screenshot?
[01:44:07] <Jonno_FTW> joannac: http://i.imgur.com/id81SOn.png
[01:46:56] <Jonno_FTW> joannac: I actually wanted getUTCDay()
[01:46:58] <Jonno_FTW> thanks for the help
[01:47:42] <joannac> Jonno_FTW: that's still really weird!
[01:52:09] <Jonno_FTW> joannac: I think it's because in my timezone, they would be the same day
[01:55:11] <joannac> Jonno_FTW: oh, finally got it to repro. okay
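For reference, a minimal mongo shell sketch of the timezone behavior worked out above. getDay() converts the stored UTC instant to the shell's local timezone before extracting the weekday, while getUTCDay() does not, so two instants on different UTC days can report the same local weekday (the collection and field names below are hypothetical):

    var a = ISODate("2013-06-06T03:05:00.000Z");
    var b = ISODate("2013-06-05T14:30:00.000Z");
    a.getUTCDay();  // 4 (Thursday in UTC)
    b.getUTCDay();  // 3 (Wednesday in UTC)
    // In a UTC+10/+11 shell both instants fall on the same local Thursday,
    // so a.getDay() == b.getDay() == 4, which is what confused Jonno_FTW.
    // The filter he actually wanted:
    db.records.find({ $where: "this.datetime.getUTCDay() == 4" });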
[01:58:27] <chanced> would you use mongo for a calendar schema or is there an alternative db with features that could make the process of handling recurring events easier?
[01:59:17] <cheeser> a date-oriented database?
[02:00:04] <chanced> which would that be?
[02:01:11] <chanced> nm, i'll look around
[02:55:37] <GrubLord> Hi all.
[02:56:19] <GrubLord> Quick question about replica sets: sure, the servers vote among themselves as to who’s primary, but how does my app know which one to send the request to?
[02:56:48] <GrubLord> Do I query one of them for the primary, somehow, and then set the request to that one?
[02:56:56] <joannac> create a replica set connection, the driver will poll the replica set and figure out which one is primary
[02:57:22] <GrubLord> hm - OK… so you connect to the mongodb differently
[02:57:56] <GrubLord> Ah, I see. Thanks, that helps a lot.
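A sketch of what joannac describes, assuming a hypothetical set named rs0 with two members. Given the set name plus one or more seed hosts, the shell (and every driver, via an equivalent connection string) discovers the remaining members and routes operations to whichever one is currently primary:

    # mongo shell: setName/host1[:port],host2[:port] syntax
    mongo --host rs0/db1.example.com:27017,db2.example.com:27017

    # driver-style connection string with the same meaning:
    # mongodb://db1.example.com:27017,db2.example.com:27017/?replicaSet=rs0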
[02:59:54] <dbclk> hey guys
[03:00:07] <dbclk> i'm having a really fucked up problem with mongodb replicaset
[03:00:10] <dbclk> can anyone help me?
[03:01:27] <joannac> dbclk: sure, but more details needed
[03:02:31] <dbclk> the issue I'm having is that I'm trying to connect mongo slaveTo to its parent
[03:02:53] <dbclk> however, it said it can't connect to it over port 27017
[03:02:59] <dbclk> I can ping the primary box
[03:03:04] <dbclk> but, from mongo I can't connect
[03:03:06] <dbclk> any ideas?
[03:03:18] <joannac> i hope you mean secondary and not slave
[03:03:31] <dbclk> yes
[03:03:39] <joannac> from the secondary, what does `mongo primaryhost:primaryhost` say?
[03:03:46] <dbclk> let me see
[03:03:48] <joannac> does it connect or does it say "refused"
[03:03:49] <dbclk> one sec
[03:04:10] <joannac> sorry, primaryhost:primaryport
[03:05:14] <ladessa> I'm trying to do an OR query... but it doesn't work.. it's returning nil.. but in RoboMongo the query works... the query is http://mysite:3000/obterOcorrencias?query={"$or": [ { "_id":ObjectId("54ff5ed8d094b1e371fba0a7")} ]} https://www.irccloud.com/pastebin/0UwB1sLy
[03:06:47] <joannac> why do you have only one clause?
[03:08:11] <ladessa> how so?
[03:08:25] <joannac> ladessa: have you checked the query you're sending looks correct?
[03:08:42] <joannac> ladessa: $or usually is only useful with multiple clauses
[03:08:45] <ladessa> yes, if I try it on RoboMongo, the query returns ok
[03:08:48] <dbclk> one sec joannac, reinstalling the nodes
[03:08:50] <joannac> if you have 1 clause, the $or is superfluous
[03:09:17] <ladessa> yes, but it can have more clauses... it depends... it's dynamic in my app
[03:09:24] <ladessa> has*
[03:09:54] <joannac> ladessa: print the query before you send it to mongodb. make sure it looks how you expect
[03:10:02] <ladessa> ok, one moment
[03:11:05] <ladessa> {"$or": [ { "_id":ObjectId("54ff5ed8d094b1e371fba0a7")}, {"_id":ObjectId("54ffcc00bef7ea3b78d11789")} ]}
[03:11:21] <joannac> okay
[03:11:37] <joannac> incread the log level on the mongod and see what it gets
[03:11:41] <joannac> increase*
[03:12:55] <ladessa> https://usercontent.irccloud-cdn.com/file/29onumzm/Captura+de+Tela+2015-03-13+a%CC%80s+00.11.59.png
[03:13:06] <ladessa> look at the robomongo print
[03:13:10] <ladessa> the same query
[03:13:21] <joannac> that's not what I asked for
[03:13:37] <ladessa> what did you ask for?
[03:13:48] <joannac> increase the log level on the mongod and see what it gets
[03:14:03] <ladessa> how do I do this?
[03:14:35] <joannac> db.adminCommand({setParameter:1, logLevel:1})
[03:15:58] <ladessa> I've increased it
[03:16:11] <ladessa> I ran it again, nothing happens
[03:16:28] <joannac> what do the mongod logs show?
[03:16:52] <ladessa> where do I see these logs? In the VPS's terminal?
[03:17:08] <joannac> ... you don't know where the mongod logs are
[03:17:21] <joannac> how did you start the mongod?
[03:17:25] <ladessa> server responded with a status of 400 (Bad Request)
[03:17:31] <ladessa> it's the chrome console
[03:17:59] <joannac> ...
[03:18:04] <joannac> where is the actual database?
[03:18:43] <ladessa> in a vps
[03:18:53] <ladessa> when node.js fails..it appears in terminal
[03:18:58] <ladessa> but nothing is appearing
[03:19:09] <ladessa> only the result is { }
[03:19:12] <ladessa> without error
[03:19:37] <joannac> so log into your vps and go look for the mongod logs
[03:21:14] <joannac> I need to know what the server is getting from your app
[03:26:26] <ladessa> https://www.irccloud.com/pastebin/kqSEfB35
[03:26:51] <joannac> that's not what your app is sending
[03:27:16] <joannac> look for the string "occorrencias"
[03:27:45] <joannac> or, run your app again, you should still get no results
[03:27:56] <joannac> then look in the last 50 or so lines of mongod log for the query
[03:39:57] <ladessa> @joannac
[03:40:49] <ladessa> It works.. but only 'findOne'; 'find' doesn't work https://www.irccloud.com/pastebin/NmFKDUyV
[03:42:29] <joannac> are you going through the cursor when you do find()?
[03:42:42] <joannac> find() gives a cursor, you have to iterate through the cursor to get documents
[03:56:51] <ladessa> aah, ok..i'm trying to convert like:
[04:00:02] <ladessa> https://www.irccloud.com/pastebin/IahYSiG5/It+works+great%21+the+only+problem+now+is+use+query+from+the+url+param
[04:13:59] <ladessa> https://www.irccloud.com/pastebin/ZsS3PnFt
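To make joannac's cursor point concrete, a minimal mongo shell sketch (the collection name is hypothetical; the ObjectIds are ladessa's):

    // findOne() returns a single document (or null) directly:
    db.ocorrencias.findOne({ _id: ObjectId("54ff5ed8d094b1e371fba0a7") });

    // find() returns a cursor; documents only appear once you iterate it:
    var cur = db.ocorrencias.find({ $or: [
        { _id: ObjectId("54ff5ed8d094b1e371fba0a7") },
        { _id: ObjectId("54ffcc00bef7ea3b78d11789") }
    ]});
    while (cur.hasNext()) { printjson(cur.next()); }
    // or cur.toArray() to materialize all matches at once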
[04:31:28] <dbclk> joannac: this is what I'm getting trying to connect
[04:31:28] <dbclk> http://pastie.org/10022292
[04:33:27] <joannac> so did you check that the mongod is started and there are no firewalls in the way?
[04:33:35] <joannac> also... where is that message from?
[04:33:54] <dbclk> you're talking about local firewall on primary right joannac ?
[04:34:08] <joannac> or firewall on the secondary
[04:34:44] <dbclk> ok I believe that's iptables.. using centos
[04:34:50] <dbclk> not sure how to do it but, let me google
[04:36:12] <joannac> dbclk: before you do that, can you log into the primary, and check whether the mongod is running?
[04:36:23] <joannac> dbclk: if so, test if you can connect
[04:36:24] <dbclk> sure
[04:36:34] <dbclk> primary is running
[04:36:37] <joannac> if you can connect, let me know the output of db.serverCmdLineOpts()
[04:36:41] <dbclk> how do i check if I can connect on primary?
[04:36:46] <dbclk> ok
[04:36:53] <joannac> type mongo 12.7.0.0.:27017
[04:36:58] <joannac> or whatever the correct port is
[04:37:00] <dbclk> ok
[04:37:02] <joannac> 127.0.0.1
[04:37:06] <joannac> damn dots
[04:41:06] <dbclk> joannac: I can connect over 127.0.0.1:27017
[04:41:13] <dbclk> this is the output from the command
[04:41:14] <dbclk> http://pastie.org/10022297
[04:42:18] <joannac> dbclk: thanks. go back to checking out IPtables
[04:42:32] <joannac> for some reason your secondary server can't contact the primary server
[04:42:43] <dbclk> ok
[04:42:44] <joannac> that's a network issue... i can't help with that :(
[04:43:02] <dbclk> but, it sounds like the firewall on primary is causing it right?
[04:44:59] <ladessa> {"$or": [ { "_id":ObjectId("54ff5ed8d094b1e371fba0a7")}, {"_id":ObjectId("54ffcc00bef7ea3b78d11789")} ]} -? Why am I getting SyntaxError: Unexpected token O in ObjectId?
[04:45:24] <joannac> dbclk: hard to say where the problem is other than "somewhere between the 2 servers"
[05:07:23] <ladessa> I have finally resolved my problem!
[05:07:39] <ladessa> the _id in param must be converted to ObjectId
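The fix ladessa describes, sketched in mongo shell terms (collection name hypothetical): an _id arriving in a URL query string is a plain string, and a string never equals an ObjectId, so it has to be rebuilt before querying:

    var idString = "54ff5ed8d094b1e371fba0a7";         // as received from the URL
    db.ocorrencias.find({ _id: idString });            // matches nothing
    db.ocorrencias.find({ _id: ObjectId(idString) });  // matches the document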
[05:23:55] <dbclk> hey joannac is it possible from the secondary I run the mongo command to connect to remote on port?
[05:25:44] <joannac> yes, i thought that
[05:25:48] <joannac> s what you tried?
[05:29:57] <dbclk> I tried this joannac mongo IP 27017
[05:30:04] <dbclk> got this error, file does not exist
[05:30:11] <dbclk> i think he's looking for the name of the DB
[05:30:17] <dbclk> how can I specify what DB to connect to?
[05:40:30] <joannac> do you actually have the mongo shell installed?
[10:26:29] <sterichards> test
[10:26:48] <sterichards> Is it possible to import nested CSV data into mongodb?
[10:29:21] <kali> nested csv ? what is that ?
[10:30:11] <sterichards> Well JSON supports array data, CSV has to be hacked to support it. Can mongo support any kind of hacked CSV structure that's “supporting” arrays?
[10:37:02] <Derick> you need to convert it to json yourself
[11:24:43] <boutell> Morning! How do I ensure read-after-write consistency when my code might be running on a replica set? That is, how do I ensure that I can always read what I’ve just written? And is it possible to set the “default write concern” so that I don’t have to patch every insert() call? I’m using the node driver. Thanks.
[11:26:46] <KekSi> 'morning, is there an alternative to running mongo --eval "" on the commandline now that eval is deprecated?
[11:27:43] <cheeser> echo "" | mongo
[11:28:07] <KekSi> one of my old scripts uses mongo --eval "printjson(db.fsyncLock())" and *Unlock
[11:28:38] <KekSi> (doing LVM snapshots and forcing consistency before doing the snapshot)
[11:35:35] <MatheusOl> KekSi: Notice that it is advisable to keep the connection where you issued fsyncLock open, and to do fsyncUnlock from the same one, as fsyncLock can in some circumstances block new connections
[11:38:09] <pamp> Hi
[11:50:51] <KekSi> i know, never happened to me though
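A sketch of the locking sequence MatheusOl is cautioning about, using the commands from KekSi's script. The point is that lock and unlock should happen on one long-lived connection, with the snapshot taken in between:

    // in a single mongo shell session, kept open for the whole window:
    printjson(db.fsyncLock());    // flush to disk and block further writes
    // ... take the LVM snapshot from another terminal ...
    printjson(db.fsyncUnlock());  // release, from the SAME connection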
[12:05:35] <arussel> would the index be used if querying a regex case insensitive ? /^foo/i
[12:31:45] <amitprakash> Hi, not as much a mongo question as an application using mongo question, I wish to be able to attach future timestamps to documents which when expire(datetime.now() == future_ts) trigger some sort of callback/service. These future timestamps can change before expiry setting them to a closer/farther point in the future
[12:31:56] <amitprakash> What would be a good way to handle this?
[12:32:47] <cheeser> a thread in your app to query for expired docs
[12:33:20] <amitprakash> So something that polls of db.coll.find({future_ts < now }) ?
[12:33:25] <amitprakash> s/of/for
[12:33:57] <amitprakash> cheeser, aight thanks
[12:44:22] <GothAlice> amitprakash: If the callback is for cleanup, a TTL index will do what you want, and lets MongoDB handle the periodic clean-up.
[12:45:13] <GothAlice> I typically have an "expires" diatomite, with a TTL of zero. Any value of None is set to not expire. :)
[12:45:26] <GothAlice> s/diatomite/datetime/ ugh, autocorrect
[12:48:24] <amitprakash> GothAlice, nope not clean up, more like reminders to users
[12:49:23] <amitprakash> GothAlice, cheeser, I see a small problem with the polling approach, a particular document can have an event expire and then be updated to a future ts between two consecutive polls
[12:49:41] <amitprakash> this would lead to the poller never catching the expiry
[12:49:42] <GothAlice> Indeed, I was typing up the "optimum solution". ;)
[12:49:58] <amitprakash> aight :D
[12:50:31] <GothAlice> A slightly more efficient method would be to have your thread with a wake-up queue, and a capped collection to notify when new dates are inserted.
[12:51:24] <amitprakash> GothAlice, that should work except when we are doing a couple of million notifications/events ?
[12:51:40] <amitprakash> pretty large wake up queue that way
[12:52:08] <GothAlice> Thread A "tails" the capped collection. When you update a future_ts time, push the ID of the record and the new time to the capped collection. Thread A wakes up, sees the new time, and pushes it onto the queue. The "worker" thread wakes up, reads the new expected timestamp and ID, then waits on the waker queue for the time between now and the future time. After it executes a task, it queries for the next task, time-wise, then waits that
[12:52:08] <GothAlice> long on the wake-up queue.
[12:52:32] <GothAlice> Basically all that matters is the record with the closest-to-now timestamp.
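A rough mongo shell sketch of the two halves of GothAlice's design (collection and field names are hypothetical). The worker only ever cares about the soonest pending document; the capped collection exists so reschedules can interrupt its sleep:

    // worker: fetch the next due (or soonest upcoming) reminder
    db.reminders.find({ future_ts: { $lte: new Date() } })
                .sort({ future_ts: 1 }).limit(1);

    // producer: when a future_ts changes, also nudge the tailing thread
    db.createCollection("wakeups", { capped: true, size: 65536 });
    db.wakeups.insert({ reminder_id: someId, future_ts: newTime });  // someId/newTime hypothetical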
[12:52:56] <boutell> How do I ensure read-after-write consistency when my code might be running on a replica set? That is, how do I ensure that I can always read what I’ve just written? And is it possible to set the “default write concern” so that I don’t have to patch every insert() call? I’m using the node driver. Thanks.
[12:54:03] <GothAlice> boutell: Saw your question from earlier, but you left. ;) Two ways: since writes need to go to the primary anyway, for queries where you must have immediate consistency, issue your query on the primary. Replication lag can't bite you, then. Or, if you are willing to accept slower writes, set a write concern of N (for the number of secondaries you have), then even reads from secondaries will be consistent.
[12:55:19] <GothAlice> boutell: And yes, you can set a default write concern (and read concern) when connecting your client app. I.e. the "w" "j" and "async" arguments to MongoClient: http://api.mongodb.org/python/current/api/pymongo/mongo_client.html
[12:55:28] <GothAlice> s/async/fsync/
[12:55:47] <GothAlice> (It'll be similar for Node.)
[12:57:23] <boutell> GothAlice: interesting. I’m not sure how solution one helps me, in that my read might hit a secondary which hasn’t replicated what I just wrote to the primary yet, right? I see how solution two works.
[12:57:36] <GothAlice> amitprakash: https://gist.github.com/amcgregor/4207375 < this "actual work collection" and "notification queue" method is what I use for my distributed RPC, including scheduled (not just immediate) task execution. A single host with two producers, four consumers benchmarked at 1.9 million RPC calls per second. (So… should be pretty efficient.) The capped collection size in your case could be quite small: only times are being pushed there.
[12:57:40] <amitprakash> boutell, w=num doesn't help ?
[12:57:52] <amitprakash> GothAlice, aight, will look into this. thanks again :D
[12:57:54] <boutell> amitprakash: that’s “solution two” and yes, it does help.
[12:58:00] <GothAlice> boutell: I thought I was clear, if you explicitly read from the primary, and _not_ a secondary. That's solution one.
[12:58:08] <boutell> GothAlice: oh, I see.
[12:58:32] <amitprakash> the idea with r/o replicas is to ensure read heavy apps do not affect writes
[12:59:11] <amitprakash> however for mission critical reads you should either stick to primary or have to take a hit on perf via w=N given theres latency b/w replication
[12:59:20] <GothAlice> Just so.
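The two options above, sketched in mongo shell syntax (collection name and variables hypothetical). The write concern here uses "majority" rather than a hard-coded N, a common variant of the w=N approach discussed that avoids knowing the secondary count in advance:

    // option 1: pin must-be-current reads to the primary
    db.getMongo().setReadPref("primary");
    db.sessions.find({ _id: sessionId });  // sessionId hypothetical

    // option 2: make the write wait for replication, at a latency cost
    db.sessions.insert(doc, { writeConcern: { w: "majority" } });  // doc hypothetical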
[12:59:26] <oznt> hi everyone, I have a db ca. 300GB big, I added an empty replica member (I did not preseed it). It started recovery, and the number of files was growing for about 18h. I saw in the logs it was building indices, but after a while (it had already copied 114GB) all the data was gone from the replica; now I have again ca. 22GB, and I don't know what to look for in the logs
[12:59:34] <oznt> I would appreciate some advice here
[12:59:53] <boutell> So if I’m writing an open source npm module that needs read-after-write behavior, how would I accomplish that without knowing the number of secondaries going in (since npm modules are popped into other people’s projects)? I’m getting complaints that, for instance, the connect-mongo module’s sessions don’t appear to work, presumably due to the replication issue.
[13:00:13] <GothAlice> oznt: Sounds like your oplog isn't large enough to handle all of the activity that happens while a replica is seeding itself.
[13:00:22] <amitprakash> boutell, either read from primary or require user to specify N in some setting/config
[13:00:26] <amitprakash> or argument
[13:00:30] <GothAlice> oznt: http://docs.mongodb.org/manual/tutorial/change-oplog-size/
[13:01:02] <oznt> GothAlice, yes, my Oplog is small, (only 7GB) But I am afraid to change it without having another replica member or at least a backup...)
[13:01:24] <GothAlice> oznt: You'll either need to increase the oplog size to accommodate possible activity during seeding (i.e. it needs to store everything up while the new secondary populates itself, so that the new secondary can catch up fully) or pre-seed.
[13:01:44] <GothAlice> oznt: When in doubt, have backups. ;)
[13:02:00] <boutell> interesting. So I can’t determine n at start by querying the server or something? I guess another replica could be added at any time, though. Of course that’s also a problem when you preconfigure n.
[13:02:01] <oznt> GothAlice, my backup policy was replication ...
[13:02:19] <boutell> (thanks for the input folks.)
[13:02:47] <oznt> GothAlice,Mongodb docs say not to use mongodump and mongorestore
[13:03:13] <GothAlice> …
[13:03:15] <GothAlice> wat?
[13:03:24] <GothAlice> oznt: Link to that?
[13:04:01] <GothAlice> boutell: You can ask the client driver for the full list of known replicas.
[13:06:55] <GothAlice> boutell: http://api.mongodb.org/python/current/api/pymongo/mongo_client.html#pymongo.mongo_client.MongoClient.nodes (for the Python version of this, should be similar on other languages)
[13:09:13] <oznt> GothAlice, "Important: Always use filesystem snapshots to create a copy of a member of the existing replica set. Do not use mongodump and mongorestore to seed a new replica set member." found this here: http://docs.mongodb.org/manual/tutorial/expand-replica-set/
[13:09:45] <GothAlice> oznt: Ah, I was referring to simply having a backup of your data. A task for which mongodump excels.
[13:10:47] <GothAlice> boutell: Testing that out in the Python version, upon first connection (seeding the server list with the primary only) and prior to any secondary-capable queries, my .nodes only include the seed node. Hmm. Further down the pipe, you could also ask the server itself using a command.
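The "ask the server itself" route GothAlice mentions, as mongo shell commands run against any replica set member:

    db.isMaster().hosts     // member list as the server knows it
    db.isMaster().primary   // host:port of the current primary
    rs.status()             // full per-member state, given sufficient rights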
[13:13:37] <GothAlice> oznt: At the moment, likely due to a lack of actionable monitoring catching that your oplog is no longer large enough for the amount of data you have, you have no choice but to either increase the oplog size, or pre-seed a node _then_ increase the oplog size to ensure a node going down and needing to be re-seeded won't simply die.
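For the oplog-versus-initial-sync math GothAlice describes, the shell has a built-in report; the output shape is approximately as shown in the comments. Roughly: the "log length start to end" window must comfortably exceed the time a full initial sync of the ~300GB takes, or the new member falls off the end of the oplog mid-sync:

    db.printReplicationInfo();
    // configured oplog size:   7168MB
    // log length start to end: ...secs (...hrs)   <- the window that matters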
[13:15:24] <dbclk> folks i'm trying to connect to mongo on a remote server using an IP on port 27017
[13:15:30] <dbclk> i'm getting connection refused
[13:15:33] <dbclk> any ideas?
[13:15:42] <dbclk> I even telnet ..connection refused
[13:15:48] <dbclk> mongo is running and listen
[13:15:51] <boutell> dbclk: it may be set to bind only on 127.0.0.1. Hopefully.
[13:16:00] <dbclk> as I can connect to it from localhost
[13:16:02] <boutell> dbclk: in that, if no other security policy is in place, it would be dangerous not to restrict that.
[13:16:17] <boutell> dbclk: do you have password security in place for this database?
[13:16:23] <dbclk> yes I do
[13:16:31] <GothAlice> Authentication won't stop a connection.
[13:16:38] <GothAlice> dbclk: a) It's not actually running. b) mongod is only bound to localhost, not 0.0.0.0 (all interfaces). c) You have firewall rules preventing access. d) True local access often happens over an on-disk socket, not a network TCP socket, so you may even have TCP disabled.
[13:16:38] <dbclk> I checked the config.ini file, not seeing anything "bind"
[13:16:55] <dbclk> firewall is disabled
[13:16:57] <dbclk> selinux
[13:16:58] <boutell> config.ini? Wouldn’t it be /etc/mongod.conf
[13:17:27] <GothAlice> dbclk: "Firewall disabled" often blocks everything, for security sake. SELinux… causes more problems than it solves, for most inexperienced users.
[13:17:47] <GothAlice> (Well, blocks everything except SSH access from the service net on my machines.)
[13:18:04] <boutell> SELinux: because a computer that’s off is secure
[13:18:09] <GothAlice> Yup.
[13:18:30] <dbclk> boutell: for us config.ini is mongod.conf
[13:18:37] <dbclk> i'm not seeing a bind address
[13:19:24] <dbclk> we install system-config-firewall-tui and turn off firewall
[13:20:03] <dbclk> i'm sure it isn't the firewall as I would only get connection timeout
[13:20:07] <dbclk> and not connection refused
[13:20:14] <boutell> right
[13:20:19] <boutell> that’s why it sounds so much like bind
[13:20:20] <GothAlice> dbclk: From within the host running mongod, run and pastebin/gist the result: sudo lost | grep mongod | grep TCP | grep LISTEN
[13:20:31] <GothAlice> s/lost/lsof/
[13:20:47] <dbclk> ok
[13:21:12] <GothAlice> This will give you the authoritative answer to "is mongod listening to a TCP port?"
[13:21:19] <GothAlice> (And if so, what is it bound to.)
[13:21:28] <boutell> oooh, nice one
[13:21:36] <kali> sudo lost, great
[13:22:04] <GothAlice> sudo lsof — kali: Your IRC client doesn't auto-apply s/search/replace/ corrective regexen automatically? ;)
[13:22:07] <dbclk> is there another command I can run besides "lost"
[13:22:12] <dbclk> it isn't found in my repo
[13:22:14] <GothAlice> dbclk: lsof, not lost.
[13:22:14] <dbclk> or installed
[13:22:17] <dbclk> oh
[13:22:49] <kali> GothAlice: my brain does
[13:23:01] <GothAlice> You can also get this information from /proc, but it's a bit more of a PITA to do it that way. ;)
[13:23:57] <GothAlice> I get, on my development machine: mongod 535 amcgregor 6u IPv4 … TCP localhost:27017 (LISTEN) — bound to localhost only, running as me, on the default port.
[13:25:14] <dbclk> http://pastie.org/10023040
[13:25:19] <dbclk> that's the result
[13:25:53] <GothAlice> <dbclk> folks i'm trying to connect to mongo on a remote server using an IP on port 27017 — false. You're listening on 28017.
[13:26:12] <GothAlice> Try connecting on the right port, and it'll work a whole lot better. ;)
[13:26:22] <boutell> GothAlice: there are two ports in that output.
[13:26:51] <GothAlice> boutell: Hmm, yeah. Eye jumped to the longer line.
[13:26:57] <GothAlice> That's weird.
[13:27:31] <dbclk> hmmm
[13:27:45] <GothAlice> Okay, not weird. That's in a replica set.
[13:27:59] <GothAlice> (My test machine has only one port, my production has both, like the paste.)
[13:28:03] <dbclk> GothAlice: but, I saw 27017 in the list
[13:28:07] <GothAlice> Indeed.
[13:28:11] <dbclk> but,?
[13:28:20] <dbclk> you say i'm wrong
[13:28:23] <GothAlice> dbclk: So your problems don't relate to mongod specifically. It looks like it's doing the right thing.
[13:28:43] <dbclk> what's the right thing?
[13:29:02] <GothAlice> dbclk: I misinterpreted your paste as one long, wrapped line, instead of two. Your mongod is bound correctly to 27017 for client connections, but it's also expecting to be in a replica set, thus 28017 also being bound.
[13:29:35] <dbclk> ok
[13:29:48] <dbclk> well, funny thing is, I can't connect to 28017 either
[13:29:49] <GothAlice> If it thinks it's in a replica set, is it?
[13:29:55] <dbclk> it is
[13:30:05] <GothAlice> And do the logs indicate normal startup?
[13:30:05] <dbclk> that's why I'm trying to connect to it from a remote
[13:30:10] <dbclk> yes it does
[13:30:33] <dbclk> the funny thing about this server is that it has two nics /2 IPs
[13:30:49] <GothAlice> ('Cause MongoDB won't actually accept connections until it says as much in the log after initial startup and replication init.) Your mongod process is bound to all IPs.
[13:31:01] <GothAlice> (TCP *:27017 < the * there.)
[13:31:17] <dbclk> 10.1.1.x and 10.1.6.x, I can connect to the server over 10.1.1.x (which is an extremely busy network) but, I can't connect over 10.1.6.x (which is a quiet network)
[13:31:37] <dbclk> I know
[13:31:43] <dbclk> that's why i can't understand it
[13:31:51] <Derick> firewall?
[13:32:09] <dbclk> you mean like an external firewall?
[13:32:12] <dbclk> dont think so
[13:32:48] <GothAlice> dbclk: Could you paste/gist the output of: sudo iptables -L # (-L = list) On a machine with "no firewall enabled" the INPUT, FORWARD, and OUTPUT "chains" should all be empty.
[13:32:52] <dbclk> I can telnet to the machine on the quiet port
[13:32:54] <dbclk> but, no mongo
[13:32:57] <dbclk> telnet over ssh
[13:33:01] <dbclk> port 22
[13:33:08] <GothAlice> Yeah, that's not telnet, that's SSH.
[13:33:21] <dbclk> ok
[13:35:19] <dbclk> GothAlice: http://pastie.org/10023057
[13:35:36] <GothAlice> Yup, that's a clean firewall alright.
[13:36:07] <dbclk> it's really weird I can only connect from one IP but, not the other
[13:36:20] <GothAlice> However, it really sounds like you've got a general Linux sysops issue at the moment. mongod is behaving correctly. The IRC channel for your distribution (which I'm not familiar with) may have more helpful advice than the simple diagnostics I'm stepping you through. Or, more generally, #linux. Is the network interface you want to use (the "quiet" one) configured? (ifconfig to check)
[13:36:38] <dbclk> ok
[13:36:51] <dbclk> thanks for the assistance :)
[13:37:48] <GothAlice> If ifconfig reports the expected IP address on that interface, next is to check the routing tables to make sure there's a valid route for that interface. (The route command for this.)
[13:38:24] <GothAlice> Then, from within the VM, run: ping -I ethN IP
[13:38:43] <GothAlice> (replacing ethN with the correct eth1, eth2, etc. network device you want to test, and IP with the IP of another node on that "quiet" network.)
[13:39:23] <amitprakash> GothAlice, what's the workaround bit @ 2-capped-collection-records.js ?
[13:39:41] <GothAlice> amitprakash: Full link? My short term memory is a very tiny ring buffer. ;)
[13:39:52] <amitprakash> GothAlice, https://gist.github.com/amcgregor/4207375
[13:40:02] <GothAlice> Ah! Right.
[13:40:54] <GothAlice> If you try to "tail" a capped collection that contains no records, the query will immediately fail instead of waiting for new data to be added. So you have to "prime" the capped collection with a bogus/no-operation record to allow the tailing queries to properly wait.
[13:41:13] <GothAlice> Weird implementation detail. ^_^
[13:41:20] <amitprakash> oh okay
[13:43:34] <GothAlice> amitprakash: From the full implementation (link at the bottom of the slides gist) https://gist.github.com/amcgregor/52a684854fb77b6e7395#file-worker-py-L79-L81 automatically handles injection of a "no-op" record into any empty capped collection that gets tailed.
[13:45:21] <amitprakash> yep, reading the same
[13:46:03] <GothAlice> amitprakash: I'm working on generalizing that code and producing a proper package for it: https://github.com/marrow/task
[13:47:11] <GothAlice> Two of the goals that go beyond the original gist: a) allow remote execution of Python generator functions, streaming the generated data to any listeners, and b) a Futures-compatible API, so it can be a drop-in replacement for multi-processing and multi-threading concurrency approaches.
[14:20:59] <dbclk> question folks.. has anyone found the issue with a mongo server that has 2 nics / 2 IPs (different subnet network)? though mongo is configured to listen on all ports, you're only able to connect to one?
[14:48:22] <StephenLynx> listen on all ports? are you sure?
[14:48:35] <StephenLynx> dbclk
[14:50:43] <dbclk> StephenLynx: yes it is
[14:51:12] <StephenLynx> how did you manage to do that?
[14:51:28] <StephenLynx> and why would you need it in the first place?
[14:59:17] <GothAlice> StephenLynx: Those questions are ancillary to him not being able to generally connect. (See the IRC log from earlier today for the sysop diagnostic steps we went through to ensure that mongod is, in fact, behaving correctly in his environment.)
[15:00:18] <StephenLynx> so he is INDEED using every single port for mongo? nothing else is able to listen after mongo is running?
[15:00:49] <GothAlice> dbclk: I run my DB VMs with four interfaces: loopback, internet-facing (MongoDB doesn't listen here), ServiceNet (nor here), and DataNet. ServiceNet bridges all VMs and lets me access SSH and whatnot without exposing SSH to the internet. DataNet is a custom VLAN interface which is common to only my own VMs, which MongoDB communication happens over.
[15:01:27] <GothAlice> StephenLynx: Pedantry doesn't help solve problems. It's listening on all interfaces, again, feel free to check the log to see the other things we examined, including firewall, etc.
[15:01:44] <StephenLynx> i am not being pedantic, I am really curious because I have never heard about this kind of thing.
[15:01:50] <StephenLynx> ah
[15:01:53] <StephenLynx> all ips.
[15:01:57] <GothAlice> Yes.
[15:02:01] <StephenLynx> got it :v
[15:02:17] <GothAlice> "Listening on all ports" isn't a thing. ;)
[15:03:21] <GothAlice> (Esp. considering half of them are allocated, by default, to outbound connections, i.e. ports 32767+. :)
[15:04:41] <diegoaguilar> I just installed mongodb 3
[15:04:51] <diegoaguilar> and I got this warnings http://www.hastebin.com/ijevovavin.vbs
[15:05:06] <diegoaguilar> I looked it up and found some about it
[15:05:13] <diegoaguilar> but I got not clear how to fix this
[15:05:32] <diegoaguilar> or if I really should care about
[15:06:18] <StephenLynx> I got that too after I upgraded too.
[15:07:11] <PedroDiogo> hey everybody, can anyone recommend me a good database as a service ?
[15:07:26] <diegoaguilar> mongolab orchestrate
[15:07:35] <diegoaguilar> what do you need, PedroDiogo? what are your needs
[15:07:46] <GothAlice> diegoaguilar: Yeah. It's a performance thing. The kernel is configured to allocate larger-than-normal "pages" of memory to applications. This can lead to virtual memory map fragmentation in MongoDB, I believe.
[15:07:51] <diegoaguilar> StephenLynx, did you solve anything? or just didn't care about it
[15:07:54] <PedroDiogo> i'm new to mongo and I want to test the performance of it when using a chat-kind of app
[15:08:04] <PedroDiogo> im already using orchestrate for another app
[15:08:08] <diegoaguilar> nice :)
[15:08:13] <GothAlice> (And active defragmenting, the second option it's warning about, is also a performance thing. You don't want chunks of memory being moved around if you can avoid it.)
[15:08:32] <diegoaguilar> so what should I do to fix this GothAlice ?
[15:08:34] <StephenLynx> haven't deployed yet, so I am procrastinating about it.
[15:08:42] <GothAlice> Disable those options in your kernel configuration.
[15:08:46] <PedroDiogo> this app will actually communicate with this new chat one
[15:08:49] <GothAlice> Or at least, don't have it set to "always". ^_^
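For what it's worth, the fix usually given for these specific startup warnings (assuming they are the usual transparent-hugepage ones) is to set the THP mode to "never", re-applied at boot since the setting doesn't persist; the sysfs path can vary by distribution:

    cat /sys/kernel/mm/transparent_hugepage/enabled
    echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
    echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag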
[15:09:00] <diegoaguilar> PedroDiogo, I guess mongo will fit great,
[15:09:07] <PedroDiogo> as heroku does not offer local mongodb support, I want to test the performance of other database as a service
[15:09:10] <diegoaguilar> do you really need a database-as-a-service "service"?
[15:09:16] <PedroDiogo> diegoaguilar: not quite
[15:09:24] <PedroDiogo> i actually wanted to do it locally
[15:09:41] <PedroDiogo> with mongo as a daemon on a unix VPS and using mongoose with node
[15:09:41] <diegoaguilar> well installing mongodb is quite straightforward
[15:09:52] <PedroDiogo> yeah i know i've been testing it locally
[15:09:58] <diegoaguilar> trying out wired tiger with mongo is ...
[15:10:06] <diegoaguilar> really promising for me
[15:10:13] <diegoaguilar> GothAlice, can u guide me on how to do this?
[15:10:53] <GothAlice> diegoaguilar: Not really, at the moment. Though certainly asking in #linux or a channel dedicated to your distribution may be useful.
[15:11:04] <diegoaguilar> ok GothAlice
[15:11:09] <PedroDiogo> that is why I first wanted to go with a VPS; however, I'm a bit concerned about backups and scaling... I'm guessing that is something that mongolab or any other service would be helpful with. Am I right?
[15:11:27] <diegoaguilar> one last question, also to assist PedroDiogo: is WiredTiger enabled by default in MongoDB 3?
[15:11:44] <GothAlice> diegoaguilar: No, WT is not the default engine.
[15:11:51] <diegoaguilar> I can't say anything on mongolab as I've not used it
[15:11:53] <GothAlice> WT is as yet unproven in terms of production use.
[15:12:01] <diegoaguilar> but I guess u can always set up replication
[15:12:20] <PedroDiogo> hm ok tks
[15:12:21] <diegoaguilar> I saw this famous tweet from a facebook engineer
[15:12:26] <GothAlice> (My own benchmarks see ~15% improvement in typical query performance, and a healthy 50% or so reduction in disk usage, but I managed to make a VM kernel panic under load, so…)
[15:12:32] <diegoaguilar> trying WT and getting an awesome compression rate
[15:12:56] <diegoaguilar> so I really really shouldn't use it on production????
[15:12:57] <diegoaguilar> :O
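Since WT is opt-in on 3.0, trying it looks roughly like the following; note the on-disk formats are incompatible, so point it at a fresh dbpath (or resync/restore) rather than existing mmapv1 data files:

    mongod --storageEngine wiredTiger --dbpath /data/wt   # path hypothetical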
[15:14:42] <GothAlice> PedroDiogo: I run a 27 TiB cluster on three physical boxen at home and due to the size of the data utilize filesystem snapshotting trickery to do offsite backups. At work, our cluster is more… reasonably sized. There, we use hidden replicas to stream a backup from production to the office, one of them is delayed by 24h to allow us to undo user mistakes, and we also make regular mongodump snapshots.
[15:15:51] <GothAlice> PedroDiogo: As a note, we've never had a node actually _fail_ in four years, and until a recent security patch that required a reboot, my boxen had 3-year uptimes. ¬_¬
[15:16:10] <Derick> you're not on AWS then :)
[15:16:15] <GothAlice> (3-year *continuous* uptimes.)
[15:16:27] <GothAlice> Derick: Heck no. Our reliability went through the roof once we migrated away from AWS.
[15:16:32] <diegoaguilar> I wish I could work in such a PRODUCTION environment :P
[15:16:41] <diegoaguilar> 27TB
[15:16:42] <diegoaguilar> :P
[15:16:45] <PedroDiogo> that is crazy GothAlice :)
[15:16:45] <diegoaguilar> omfg
[15:18:03] <GothAlice> So accounting for the tiny amount of time that reboot needed, we've got something like five nines averaged across three years.
[15:18:29] <PedroDiogo> GothAlice: can you share with me some tips to maintain a good DB working, IF i go with a VPS ?
[15:18:54] <diegoaguilar> Is MMS worth trying for small projects?
[15:19:00] <GothAlice> diegoaguilar: Very much so!
[15:19:12] <GothAlice> My work cluster is MMS-managed, and it's free for < 8 nodes.
[15:19:34] <GothAlice> Also eases things like user account management.
[15:21:36] <GothAlice> PedroDiogo: a) Find a reliable hosting provider. Cheap != good in this regard, esp. as many VPSes over-provision. (I.e. they promise and sell more capacity than they have, hoping everyone uses a fraction of what they're given.) b) Automate, automate, automate. MMS is a good start, but to be truly reliable you need to be able to kill -9 a random DB VM and still have your application survive.
[15:22:44] <PedroDiogo> thats great, thanks ;)
[15:22:46] <GothAlice> (This means potentially automating the spinning up of new VMs, and _automatically_ having them join the existing cluster on startup in a way that won't risk "run on the bank" situations, which can cause cascade failures.)
[15:23:01] <diegoaguilar> GothAlice, do you recommend Digital Ocean?
[15:23:09] <PedroDiogo> right
[15:23:16] <PedroDiogo> yeah, which VPS do you recommend?
[15:23:29] <diegoaguilar> Ramnode, Digital Ocean ...?
[15:23:41] <GothAlice> diegoaguilar: No, I actually recommend staying away from them. (I have several multi-year outstanding tickets with public histories with repeated "We're working on it, it'll be done in three months" every six months.)
[15:25:26] <GothAlice> I recently evaluated clever-cloud.com, and they're working with 10gen/Mongodb.com to provide a MongoDB offering, currently in beta. Their application host pricing is _brilliantly_ competitive, and their auto-scaling infrastructure is pretty awesome.
[15:25:56] <GothAlice> (Currently they operate Tier-3 data centres in Canada and France.)
[15:26:18] <diegoaguilar> they're brazilian...
[15:26:19] <diegoaguilar> ?
[15:26:34] <diegoaguilar> french
[15:26:42] <GothAlice> French. :)
[15:27:40] <PedroDiogo> hm will take a look, thanks
[15:28:43] <GothAlice> I also highly approve of their git-based deployment workflow. (We use this style at work, though we don't use Clever Cloud. Even our VMs themselves have their root filesystem git-managed…)
[15:29:11] <PedroDiogo> what about http://www.lunacloud.com/ ?
[15:30:05] <GothAlice> … the site doesn't inspire confidence. Looks like a cheap whitelabeling of AWS.
[15:30:06] <GothAlice> :/
[15:30:56] <PedroDiogo> it's a new Portuguese solution... everybody is talking about them here in Portugal, but I haven't seen any good benchmark
[15:31:08] <diegoaguilar> isnt it spanish
[15:31:13] <diegoaguilar> as spanish from catalunya?
[15:31:33] <PedroDiogo> clever-cloud looks pretty good - it's a pity they do not provide mongodb support yet
[15:31:42] <GothAlice> Clever Cloud is also 61% as expensive for the same 2GB RAM instance. :/
[15:31:47] <GothAlice> PedroDiogo: They do, technically.
[15:31:53] <PedroDiogo> i need something to host my nodejs socket.io + mongodb app
[15:32:18] <GothAlice> PedroDiogo: https://www.clever-cloud.com/doc/addons/clever-cloud-addons/#mongodb
[15:33:33] <GothAlice> I asked them for their initial (and subject to change before release) planned pricing for MongoDB: http://cl.ly/image/3H3I1M0U0w0y
[15:33:44] <StephenLynx> PedroDiogo linode.com
[15:33:57] <GothAlice> Notably, if I tried to host my 27 TiB home dataset with compose.io, it'd cost me half a million USD per month.
[15:34:22] <StephenLynx> plus I suggest taking a look into io.js if you haven't heard about it yet. it's node.js but better and faster.
[15:34:42] <GothAlice> (We use Rackspace, but we need SLAs and things like that and deal with big companies that like the name. Linode is awesome people, too.)
[15:34:50] <PedroDiogo> I'm ok with things like mongolab or compose, but I'm a bit concerned with their performance because of the IO lag...BUT I do like to think I can rely on these type of services to scale and backup my DB as it goes...
[15:34:51] <diegoaguilar> GothAlice, how much are you paying for storing 27TB+
[15:34:53] <diegoaguilar> ?
[15:35:59] <StephenLynx> she stores it at home and uses a back-service.
[15:36:03] <GothAlice> diegoaguilar: It's my personal dataset, hosted on three Dell 1U rack-mounts and three Drobo 8-something-i rack-mount iSCSI arrays. It's been growing since 2001, so it's hard for me to estimate the "total cost". (But you could price it out: grab three two-year old servers and three arrays, full of 4TiB WD Green Power drives.)
[15:36:13] <StephenLynx> a back-up service*
[15:36:55] <diegoaguilar> and just curious, what's that data about?
[15:37:37] <GothAlice> diegoaguilar: It's a transparent proxy and ARP MITM that records every bit of digital information I touch across any of my devices or touches my network.
[15:38:15] <diegoaguilar> interesting ...
[15:38:18] <diegoaguilar> Why? :)
[15:38:23] <StephenLynx> hoarding :v
[15:38:43] <GothAlice> (Stored as a GridFS metadata filesystem, primarily organized using bare tags and key/value tags, with synonym association, natural language processing so I don't have to manually tag everything, and some neural network bits.)
[15:39:07] <diegoaguilar> that's a great great stuff
[15:39:12] <diegoaguilar> I mean, like great
[15:39:22] <diegoaguilar> just "personal" project?
[15:39:22] <GothAlice> diegoaguilar: I plan on uploading, so I thought I'd give myself a head start. Also, experiments. This dataset includes a complete copy of Wikipedia, and 350,000 books, as an example. It's a fun playground for AI research.
[15:39:35] <diegoaguilar> yeah yaeh
[15:39:40] <diegoaguilar> you can do a PhD tesis
[15:39:41] <diegoaguilar> on it
[15:39:55] <PedroDiogo> StephenLynx and GothAlice thanks for the linode tip - will take a look
[15:39:55] <GothAlice> I could. Can't release it though. I've found a few hundred patents I'm in violation of. ;)
[15:40:29] <StephenLynx> you could make an anonymous release on some god-forsaken server on a god forsaken country
[15:40:47] <diegoaguilar> GothAlice is Snowden's cousin
[15:40:59] <StephenLynx> wouldn't it be quite the opposite?
[15:41:08] <GothAlice> Heh. I play by the rules. XP
[15:41:55] <StephenLynx> there's a difference between 'right' and 'lawful'.
[15:43:01] <diegoaguilar> StephenLynx, so, do you use any AI software layer before storing?
[15:43:13] <diegoaguilar> Hadoop? Storm? Mahout?
[15:43:16] <StephenLynx> wat
[15:43:32] <GothAlice> Though I do try to open-source what bits I can get away with. For example, years before MongoDB had full-text indexing, I added it: https://github.com/marrow/contentment/blob/develop/web/extras/contentment/components/search/model.py#L72-L112 (This is a parallel Okapi BM-25 ranking algorithm, the same as Sphinx, Lucene, or Yahoo, with boolean pre-filtering that avoids the MS patent.)
[15:43:38] <diegoaguilar> sorry, I meant GothAlice
[15:43:41] <StephenLynx> oh
[15:43:41] <GothAlice> (Of course, MongoDB has its own full-text indexing now, so yeah.)
[15:43:55] <GothAlice> diegoaguilar: Python. :)
[15:44:06] <diegoaguilar> what packages do you use?
[15:44:13] <diegoaguilar> sci kit
[15:44:14] <diegoaguilar> ?
[15:44:14] <GothAlice> … Python. ¬_¬
[15:44:18] <StephenLynx> :v
[15:44:36] <StephenLynx> plus is good to keep complex projects with a minimum of dependencies.
[15:44:50] <StephenLynx> specially long term projects.
[15:45:06] <GothAlice> diegoaguilar: Though I do rely on my own packages: https://github.com/marrow/ See: https://youtu.be/M9V1e-rG7VA?t=11m26s "Easy AI with Python", specifically, the "neural networks for data mining" part—this is what got me started with AI in Python.
[15:47:01] <diegoaguilar> interesting
[15:47:14] <diegoaguilar> I will check it out
[15:47:17] <GothAlice> At work we use Jython to have our Python code orchestrate Hadoop, Mahout, and Weka.
[15:47:24] <diegoaguilar> you know I'm starting this data mining project with a friend
[15:48:23] <diegoaguilar> having twitter and instagram as resources
[15:48:31] <diegoaguilar> on music and alcohol, are they related, when any general topic on music is popular, is activity on music increasing?
[15:49:01] <GothAlice> Oh, trend prediction. Hope your calculus is fresh. ;)
[15:49:29] <diegoaguilar> so last question to be answered should be "would a rock style advertising work with beers, whiskey or anything else in somewhere"
[15:50:08] <diegoaguilar> we will use mongodb
[15:50:27] <diegoaguilar> but I need to do a simple research on timeseries querying with mongo
[15:50:46] <diegoaguilar> like "has activity on rock increased in the last 2 hours"?
[15:50:57] <GothAlice> We find MongoDB to be the best at streaming live analytics. (Using pre-aggregation.) For our NLP/ranking/AI-type-stuff we pop into Java and more dedicated tools.
[15:51:35] <GothAlice> In a similar way, I wouldn't attempt to model a graph in MongoDB, there are better tools for the job. (And mongo-connector to keep the datasets in sync.)
[15:51:59] <diegoaguilar> yeah, I want to try out either Neo4J, Riak or Allegro
[15:52:11] <diegoaguilar> when its about relationships
[15:52:12] <GothAlice> (Though we haven't been using that, since our Java code has direct execution access to our Python code, so it's all one system.)
[15:52:29] <diegoaguilar> what about Jython performance?
[15:53:11] <GothAlice> Jython's performance is "acceptable". Technically it's just compiling Python code for the JVM, with some glue in there to work around the differences.
[15:53:29] <GothAlice> JVM does threading better, but dynamic types on the Python-side slow things down a bit.
[15:53:50] <diegoaguilar> that's what youre using at work?
[15:54:01] <GothAlice> For the "compare a person vs. a job" code, yes.
[15:54:04] <diegoaguilar> for your personal hoarding project u only do python
[15:54:06] <diegoaguilar> bare python
[15:54:08] <GothAlice> Correct.
[15:54:22] <diegoaguilar> where u do work? :)
[15:54:27] <GothAlice> Well, it's bare Python code running under Pypy. (JIT compilers are nice.)
[15:54:54] <GothAlice> diegoaguilar: Illico Hodes, an HR consulting firm and job distribution platform provider. :)
[15:55:22] <diegoaguilar> je pense tu habites a la france (I think you live in France)
[15:55:45] <GothAlice> diegoaguilar: Close. ;) J'habite à Montréal, au Québec. (I live in Montréal, in Québec.)
[15:55:52] <diegoaguilar> oh
[15:56:10] <diegoaguilar> I was studying french for year and a half
[15:56:15] <diegoaguilar> but stopped it :(
[15:56:21] <diegoaguilar> would love to speak another language
[15:56:29] <diegoaguilar> only spanish and english :/
[15:56:56] <GothAlice> French is mostly useful, in a world-wide scale, for international treaties and international contracts.
[15:57:03] <diegoaguilar> yeah
[15:57:10] <diegoaguilar> in my city, in my univeristy
[15:57:18] <diegoaguilar> they're crazy about german
[15:57:24] <GothAlice> Even in Canada, which is hypothetically bilingual, French is mostly relegated to the province of Québec.
[15:57:43] <GothAlice> Many engineers studying at your uni?
[15:57:43] <GothAlice> ;)
[15:57:50] <Derick> diegoaguilar: France
[15:57:51] <Derick> :)
[15:57:58] <diegoaguilar> and so Cameroon
[15:58:00] <diegoaguilar> Senegal
[15:58:01] <Derick> large parts of africa
[15:58:07] <GothAlice> Parts of Africa (along with Dutch and various other languages held over from colonization), etc.
[15:58:10] <diegoaguilar> but I mean, where in Canada
[15:58:19] <Derick> diegoaguilar: Prince Edward Island?
[15:58:26] <GothAlice> Mostly Québec. Government workers anywhere in Canada need to be fluent in both, though.
[15:58:33] <diegoaguilar> I know world history and geography :P
[15:58:45] <diegoaguilar> I guess spanish is more spoken than french all along Canada
[15:58:48] <diegoaguilar> :D
[15:59:24] <GothAlice> diegoaguilar: That's mostly a USA thing. Canada is very… diverse. Vancouver is largely chinese, Richmond is largely middle-eastern, etc., etc. (Examples from BC, where I grew up.)
[16:00:05] <diegoaguilar> my father in law is willing to work in Canada
[16:00:14] <diegoaguilar> he's so avid now on learning english
[16:00:24] <GothAlice> (So it's usually <family language> + English, wherever you go, here.)
[16:02:02] <GothAlice> Lojban is an awesome language.
[16:02:37] <diegoaguilar> I dont even know what Lojban is
[16:02:48] <diegoaguilar> where is that spoken
[16:03:45] <GothAlice> It's like Esperanto, so there are few "natural born speakers".
[16:03:58] <GothAlice> (lojban.org if you're interested)
[16:04:14] <GothAlice> (http://lojban.github.io/cll/ I find to be the easiest way to read the reference grammar.)
[16:05:03] <diegoaguilar> GothAlice, do you speak english, french and ...
[16:05:03] <diegoaguilar> ?
[16:05:32] <GothAlice> Japanese, and a tiny bit of German and Dutch. (My mother's Dutch. ;)
[16:05:50] <GothAlice> Nihongo wa sugoi desu ne. ;) ("Japanese is amazing, isn't it." Don't have my IME set up at the moment.)
[16:06:17] <Derick> ah, nog iemand die nederlands spreekt (ah, someone else who speaks Dutch)
[16:06:45] <diegoaguilar> how did you learn japanese
[16:07:11] <GothAlice> diegoaguilar: My highschool had a two-year elective for it. Hallier-sensei was awesome.
[16:07:43] <diegoaguilar> hehe
[16:07:45] <GothAlice> Japanese is my second favourite language, after Lojban. (For many of the same reasons I like Lojban, actually. It's a pretty grammatically regular language.)
[16:07:57] <diegoaguilar> as in regexes? :P
[16:08:12] <diegoaguilar> ever met with lojban speakers?
[16:08:29] <GothAlice> Well, let's just say Lojban's grammar is completely unambiguous and wholly defined by a FLEX/YACC grammar, so it's 100% machine parseable, too.
[16:09:01] <GothAlice> For examlpe: http://www.lojban.org/camxes/?text=le%20mi%20varkiclaflo%27i%20cu%20culno%20lo%20angila (my hovercraft is full of eels)
[16:09:14] <GothAlice> Example, even.
[16:10:14] <GothAlice> Haven't met lojban speakers, no. Hang with them on IRC, yes. ;)
[16:11:18] <diegoaguilar> in my own opinion I wouldn't learn either esperanto or lojban or something
[16:11:19] <diegoaguilar> :)
[16:11:30] <diegoaguilar> I would prefer to know a lot of world spoken languages
[16:12:42] <GothAlice> My uni experience was psychology, with my "big project" being on the Sapir-Whorf Hypothesis (how the language you are taught might determine the range of thoughts you are capable of experiencing). Lojban is a direct way to test the hypothesis.
[16:13:11] <diegoaguilar> interesting
[16:13:45] <diegoaguilar> so you studied ... Computer Sciences ?
[16:13:47] <diegoaguilar> Psychology?
[16:14:03] <GothAlice> Psychology. The professor teaching the CS courses was a former employee of mine… so yeah.
[16:14:30] <diegoaguilar> you're a psychologists?
[16:14:38] <GothAlice> As one really neat example: "that's a pretty little girls school". (apostrophe on "girls" omitted intentionally, pretend it's not needed in this instance) — what exactly did I mean by that? Is it a school for pretty little girls? Is it a pretty school, for little girls? Is it a little school for pretty girls? Etc.
[16:15:16] <diegoaguilar> hehe
[16:15:17] <GothAlice> (In Lojban, there is no ambiguous way to represent this statement. It's _always_ one of the specific forms.)
[16:16:22] <diegoaguilar> so you studied high school, then psychology at university
[16:16:29] <diegoaguilar> and ended up being a software engineer
[16:16:37] <diegoaguilar> working with jython, neural networks and mongodb
[16:16:38] <diegoaguilar> :P
[16:17:13] <diegoaguilar> ever practiced clinical psychology? :P
[16:17:25] <diegoaguilar> like therapy, gestalt ...
[16:17:38] <GothAlice> Ha, no. Funnily enough I don't, in general, like people. XP
[16:17:51] <GothAlice> The irony was palpable during lectures. XP
[16:17:58] <diegoaguilar> LOL
[16:18:10] <diegoaguilar> lol as in LOL :)
[16:18:27] <diegoaguilar> you studied that in order to try to like people?
[16:18:41] <GothAlice> (I focused on the biology of psychology as an interest in the eventual goal of uploading.)
[16:19:05] <diegoaguilar> my mother is a psychologist
[16:19:33] <diegoaguilar> just did a master's degree in humanism and gestalt psychotherapy
[16:19:50] <GothAlice> Nice.
[16:20:00] <diegoaguilar> I heard a lot about psychology in my life :P
[16:21:17] <GothAlice> These days I apply my knowledge of the nervous system to software systems. (I'm going the top-down functional / block diagram approach to creating an ACI instead of the physical simulation bottom-up approach.)
[16:21:56] <diegoaguilar> well I guess you have a great mind
[16:22:17] <GothAlice> Stackless Python (which Pypy supports a subset of) is awesome for modelling processing pipelines in a similar way to the functional data processing in our brain. Parallel everything FTW.
[16:22:31] <diegoaguilar> but really, were u doing computer science stuff while in university?
[16:22:40] <diegoaguilar> u did a research?
[16:23:05] <GothAlice> Well, I had already been a developer for 8 years by the time I went to uni.
[16:24:07] <GothAlice> So, beyond the fact that I knew I knew more than the IT professor, I didn't really see the point of the expenditure just to get a bit of paper. ;)
[16:24:25] <diegoaguilar> its interesting
[16:24:46] <diegoaguilar> I wouldnt study anything apart than computer sciences
[16:24:50] <diegoaguilar> Im a bit sick of university
[16:25:07] <GothAlice> Just not under the auspices of an institution built for the purpose. ;)
[16:52:26] <Petazz> Is it possible to pass through some data in a $group that isn't operated on?
[16:53:14] <Petazz> Let's say I have a blog and I want to know as a time series how many distinct users have made a comment as a total sum
[16:53:37] <Petazz> So first day 1 distinct person had commented, second day 3 in total etc
[16:54:20] <GothAlice> Petazz: For my forums that's done using a simple aggregate query. I range select the modification time of the threads, unwind on replies, project just the author field, then group on that author field while $sum 1'ing.
[16:58:07] <Petazz> GothAlice: Hmm cool, what I've been missing is the unwind I guess..
[16:58:41] <GothAlice> Notably, that's one of the few statistics I don't pre-aggregate.
[16:59:37] <GothAlice> The number of replies, number of upvotes, etc. are stored against each reply, but at the Thread level there's a pre-aggregated version of all of those stats so I can more efficiently get per-thread stats grouped in various ways without needing to reprocess (unwind) every single reply each time.
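GothAlice's pipeline, sketched in the mongo shell (collection and field names are guesses at her schema; start/end are placeholder dates):

    db.threads.aggregate([
        { $match: { modified: { $gte: start, $lt: end } } },   // range-select threads
        { $unwind: "$replies" },                               // one doc per reply
        { $project: { author: "$replies.author" } },           // keep only the author
        { $group: { _id: "$author", comments: { $sum: 1 } } }  // count per distinct author
    ]);
    // the number of result documents is the distinct-commenter count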
[17:00:17] <lmofi> Using MongoDB, I get the data of the right row, but how do I delete these rows? channelTB.messages.forEach(function(v){ console.log(v) }); the v output is like {data: data} {data: data} {data: data}
[17:13:31] <blizzow> I installed a copy of 3.0 and added a root user. I also commented out bind_ip=127.0.0.1 in my mongod.conf and restarted mongo services. I'm still only able to authenticate from the host that mongo is running on. Is there something else I have to do to allow remote authentication?
[17:14:00] <GothAlice> bind_ip=127.0.0.1,<external IP>
[17:14:10] <GothAlice> Or, if you want to throw caution to the wind: bind_ip=0.0.0.0
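In the pre-YAML mongod.conf style those inline examples use, that would look like the following (the external address is a placeholder):

    # listen on loopback plus one external interface
    bind_ip = 127.0.0.1,192.0.2.10
    # or, throwing caution to the wind, all interfaces:
    # bind_ip = 0.0.0.0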
[17:27:16] <Petazz> GothAlice: In your way of unwinding the replies, don't you miss out on the timestamp though?
[17:27:37] <GothAlice> Depends on what you project, and how you group.
[17:27:38] <blizzow> GothAlice: I tried giving bind_ip=0.0.0.0 and I'm still unable to authenticate from remote hosts.
[17:28:39] <GothAlice> blizzow: Are you at least able to connect to the host remotely? Telnet to the MongoDB port on it? (27017 I believe.) Ping it?
[17:29:08] <blizzow> I'm able to telnet to 27017 for sure.
[17:29:38] <blizzow> It gets far enough through the conversation to give an auth failed (Error #18)
[17:30:28] <GothAlice> blizzow: See http://docs.mongodb.org/v3.0/release-notes/3.0-compatibility/ and the sections on authentication/security changes.
[17:30:56] <GothAlice> blizzow: Also: https://jira.mongodb.org/browse/SERVER-17459
[17:31:53] <lmofi> Using MongoDB, I get the data of the right row, but how do I delete these rows? channelTB.messages.forEach(function(v){ console.log(v) }); the v output is like {data: data} {data: data} {data: data}
[17:40:07] <srajbr> in the case of migrating ranges of data from one shard to another, does the _id change?
[17:41:36] <Derick> srajbr: no, _id is immutable
[17:43:16] <srajbr> Derick thanks. I was confused since ObjectId generation follows some rule dependent on the host
[17:43:45] <Derick> yes, but they are created on the *client* host
[17:43:47] <Derick> not on the server
[17:45:51] <srajbr> on client!!!!
[17:46:18] <srajbr> does it follow: timestamp, machine, pid, increment?
[17:46:57] <GothAlice> srajbr: Yes, but that data is collected and generated in the client application, not the server.
[17:47:16] <GothAlice> machine = app server, PID = application process ID, increment = record generated in that process, starting at a random value.
[17:48:20] <srajbr> ok "app server" = "mongodb-client"
[17:49:32] <GothAlice> One example of a potential client, yes.
[17:50:02] <GothAlice> Task workers, if you use distributed tasks, would be another example of a mongodb client. (They're not webapps, but they're still clients.)
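A small shell illustration of that layout, using a made-up ObjectId; the 24-char hex string decomposes into the four parts srajbr lists:

    var id = ObjectId("550303e965ba371c00a0a88f");
    id.getTimestamp()           // 4-byte timestamp (first 8 hex chars)
    id.str.substring(8, 14)     // 3-byte machine identifier
    id.str.substring(14, 18)    // 2-byte process id
    id.str.substring(18, 24)    // 3-byte counter, starting at a random value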
[17:53:14] <srajbr> Thanks GothAlice for the explanation :)
[17:56:02] <lmofi> Using MongoDB, I get the data of the right row, but how do I delete these rows? channelTB.messages.forEach(function(v){ console.log(v) }); the v output is like {data: data} {data: data} {data: data}
[17:58:24] <boutell> sounds like 3.0 has a new default bind behavior? (Not a bad idea)
[18:04:20] <daidoji> is it possible for mongoexport to produce a JSON object that can't then be mongoimported?
[18:04:46] <daidoji> it's giving the error "exception: BSON representation of supplied JSON is too large: code FailedToParse", perhaps?
[18:05:48] <Boomtime> daidoji: are you saying you get that error when you try to mongoimport a specific doc?
[18:11:10] <daidoji> Boomtime: no, I'm saying I mongoexport a collection and then mongoimport it, and it throws that error
[18:11:18] <daidoji> Boomtime: I can see why that error would be thrown at other times
[18:19:56] <GothAlice> Potentially, shorts are being converted to longs or floats, since all numbers in JSON are floats. This may be increasing the storage requirements of one or more records beyond the 16MB limit. This is, however, unlikely.
[18:20:34] <GothAlice> daidoji: ^
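A rough way to see the widening GothAlice suspects, in the mongo shell (the collection name is hypothetical): an int32 field occupies 4 bytes of payload, while the same value round-tripped as a plain JSON number comes back as an 8-byte double.

    db.sizes.drop()
    db.sizes.insert({ n: NumberInt(1) })
    Object.bsonsize(db.sizes.findOne())       // smaller: n stored as int32
    db.sizes.update({}, { $set: { n: 1 } })   // plain shell numbers are doubles
    Object.bsonsize(db.sizes.findOne())       // larger: n stored as a double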
[18:21:31] <daidoji> GothAlice: hmmm, okay, I will check to see if that's it.
[18:21:33] <Boomtime> daidoji: can you mongodump and mongorestore the offending doc?
[18:21:54] <Boomtime> if you can, then type conversion is a likely cause
[18:22:28] <Boomtime> what driver do you use? and do you use a lot of ints? say, arrays of int?
[18:23:14] <GothAlice> Arrays of ints, or arrays of arrays of ints, would be the only way I'd think would be feasible to reach the 16MB limit, though. 16MB is a _lot_. Binary data?
[18:23:48] <daidoji> Boomtime: I can but I need the --upsert clause
[18:24:12] <daidoji> I'm not sure what driver, whatever I yum installed today from mongo's yum repo
[18:24:22] <daidoji> GothAlice: no arrays of ints or arrays of arrays or binary data
[18:24:26] <GothAlice> daidoji: What's your application written in?
[18:24:35] <GothAlice> (We're interested in client driver, not server. ;)
[18:24:40] <daidoji> well I was shell scripting this step
[18:25:11] <daidoji> this goes back to my questions the other day about transferring data from my data-warehouse mongo instance into a production mongo instance
[18:25:20] <daidoji> mongoexport/import seems the fastest
[18:25:53] <GothAlice> That's strange, considering mongodump/restore won't need to deserialize data back into BSON first.
[18:26:10] <daidoji> so it's essentially mongoexport -d db -c collection | bzip2 -9 col.bz2; on one end
[18:26:21] <GothAlice> (And can bundle things like oplogs, and index data, further reducing downtime.)
[18:26:27] <daidoji> and bzcat col.bz2 | mongoimport -d db_prod -c collection --upsert on the other
[18:26:38] <daidoji> oh I'm not using mongodump/restore
[18:26:51] <GothAlice> Hmm, the upsert thing.
[18:26:53] <daidoji> can I use that with --upsert-like functionality?
[18:28:29] <daidoji> hmmm, wait a second. maybe it's not a mongoimport issue... The second time I run it, it looks like I run into a bzip2 error...
[18:28:55] <GothAlice> You're upserting. You may be upserting too much (too many appends, for example) into a single record, and need to change how you're grouping those.
[18:29:32] <daidoji> GothAlice: what do you mean?
[18:30:22] <GothAlice> I.e. if you're grouping your records into five minute time slices (I'm doing analytics at work, so… this is the example that comes to me) but have too many events in a single five-minute period, the record may grow beyond the 16MB limit.
[18:31:05] <GothAlice> To "re-group" this data, I'd change it to two minute time intervals, for example, to reduce the individual number of events per time period (at the expense of 2.5x more records to read when performing aggregate queries.)
[18:32:43] <daidoji> hmmm, these are idempotent records and I think the distribution of upserts shouldn't be too bad, but I'll check to see what it is
[18:45:30] <daidoji> whoops my mistake, zipped file was corrupted :p
[19:22:38] <greyTEO> is it possible to specify multiple $addToSet in a single update statement?
[19:23:04] <greyTEO> BasicBSONObject.append($addToSet, query) calls seem to override each other
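What greyTEO appears to be hitting: repeated .append("$addToSet", ...) calls replace one another because the update document is a map, so both additions belong under a single $addToSet operator. A shell sketch with hypothetical field names:

    db.posts.update(
        { _id: postId },
        { $addToSet: {
            tags:      { $each: ["mongodb", "irc"] },  // several values at once
            followers: "greyTEO"                       // a second array, same operator
        } }
    )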
[19:41:47] <lmofi> What is the inventory in this code db.inventory.remove( { type : "food" }, 1 )
[19:44:09] <kali> lmofi: the name of a collection
[19:44:30] <lmofi> kali: if i do that it drops the whole collection
[19:44:50] <kali> only the document where type is "food"
[19:44:55] <kali> +s
[19:46:29] <lmofi> kali: I need to only delete the one inside the collection
[19:46:35] <lmofi> how do I do that
[19:46:46] <lmofi> so inside the collection, inside the document
[20:04:10] <arussel> is there a way to use an index when doing a case-insensitive search?
[20:04:43] <arussel> I'm searching /^foo/i and was hoping that this could use my index
[20:05:31] <GothAlice> arussel: Store a normalized-lowercase version of that field, then search lowercase.
[20:05:42] <GothAlice> Prefix searches that are case-sensitive can use the index.
[20:06:45] <arussel> yeah, I know that /^foo/ uses the index and I know the workaround, but I was hoping /i could use it 'out of the box'
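The workaround under discussion, sketched with hypothetical names: keep a lowercased copy of the field, index it, and run an anchored, case-sensitive prefix match against that copy.

    db.users.ensureIndex({ name_lower: 1 })
    db.users.insert({ name: "FooBar", name_lower: "foobar" })
    db.users.find({ name_lower: /^foo/ })   // anchored and case-sensitive: index-friendly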
[20:11:27] <arussel> with an index on {foo:1}, I was expecting db.mydb.find({}).sort({foo:1}).skip(500).limit(1) to scan only 1 document. Is my index badly done or did I misunderstand something?
[20:11:53] <arussel> nscanned is 502
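That figure is expected rather than a sign of a broken index: skip() is applied by walking the index in order, so the 500 skipped entries are still examined before the one that is returned. explain() on the same query makes this visible:

    db.mydb.find({}).sort({ foo: 1 }).skip(500).limit(1).explain()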
[20:12:48] <lmofi> How do i empty the data inside the collection > document > message
[20:13:07] <arussel> lmofi: $unset ?
[20:13:12] <StephenLynx> update
[20:13:37] <StephenLynx> db.collection.update({query},{update operations})
[20:14:48] <lmofi> collection.update({ $unset: { messages: ""} })
[20:14:51] <lmofi> like that ?
[20:15:17] <StephenLynx> no, you missed the query block
[20:15:24] <StephenLynx> before that
[20:15:33] <StephenLynx> to tell which documents should be updated.
[20:16:08] <StephenLynx> and if you wish to update multiple documents, you need a third block, for options. And then you put multi: true in this third block.
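Putting those three blocks together for lmofi's case (the collection and the channel-name filter here are hypothetical):

    db.channels.update(
        { name: "somechannel" },       // query: which documents to touch
        { $unset: { messages: "" } },  // update: remove the messages field
        { multi: true }                // options: apply to every match
    )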
[20:17:14] <lmofi> StephenLynx: the document inside the collection looks like ObjectId("55033asdsa03b193d4319de486")
[20:17:34] <StephenLynx> so you already removed the field?
[20:17:54] <lmofi> no, that's the document. i want to delete the value inside the document
[20:18:57] <StephenLynx> if the document only has its id, then you can't remove its field.
[20:19:03] <StephenLynx> you could delete it though.
[20:19:41] <lmofi> StephenLynx: but I'm getting the document with ChannelTB.findOne({name: {$regex: '^' + channel_name + '$' , $options: 'i' }}).exec(function(err, channelTB){
[20:19:49] <lmofi> so i get the document values
[20:34:42] <hmsimha> Hey, I'm reading about the changes to MongoDB between the 2.4 and 2.6 releases. It seems that a big one that might affect compatibility with the application I work on is returning a cursor for aggregations, instead of the full set. Is there a way to ensure aggregations still request the full set instead of a cursor?
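One hedged possibility, since in 2.6 it is the shell/driver helper that requests the cursor: invoking the aggregate command directly, with no cursor option, still returns the old inline form (subject to the 16MB result limit). The collection and pipeline below are hypothetical.

    db.runCommand({
        aggregate: "comments",
        pipeline: [ { $group: { _id: "$author", n: { $sum: 1 } } } ]
    })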
[20:37:40] <medmr1> hello?
[20:37:42] <medmr1> nice.
[20:38:16] <medmr1> i just went from 2.4 to 2.6 finally. pro-tip: don't let the ops guys install 2.6.3, you want 2.6.8
[20:38:40] <medmr1> 2.6.3 performed terribly compared to 2.4 and segfaulted
[20:38:48] <medmr1> 2.6.8 seems to be working smooth
[20:40:33] <hmsimha> ok, thanks medmr1
[20:40:35] <cheeser> pro tip: use mms and let it do that for you :)
[20:40:46] <hmsimha> I am 'the ops guys'
[20:41:04] <hmsimha> cheeser: mms?
[20:41:12] <cheeser> mms.mongodb.com
[20:42:10] <hmsimha> ah, we use compose.io cheeser. The upgrade itself isn't the problem (just select from a dropdown) but I want to make sure it's not going to break anything in the application
[20:44:40] <StephenLynx> why not 3.0 since you are upgrading?
[20:47:49] <hmsimha> More compatibility issues to check, StephenLynx. The main reason to upgrade is so I can take advantage of bulk inserts
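For reference, the 2.6 Bulk API being alluded to looks like this in the shell (the collection and documents are made up):

    var bulk = db.items.initializeUnorderedBulkOp();
    bulk.insert({ sku: "a", qty: 1 });
    bulk.insert({ sku: "b", qty: 2 });
    bulk.execute();   // one round of batched writes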
[20:48:00] <StephenLynx> hm
[21:22:15] <lmofi> Omg can somebody help me with a documten i want to empty up a index inside a document, collection > document >1 index value named messages
[21:31:03] <NoOutlet> You're going to need to make that request considerably more coherent.
[22:47:19] <daidoji> okay new question based on earlier question
[22:47:44] <daidoji> has anyone ever run into an issue where mongoexport | bzip2 always causes a corrupt bzip file?
[22:48:28] <daidoji> hmmm, maybe it's not mongoexport...
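One possible culprit, offered as an assumption rather than a diagnosis from the log: given a filename operand, bzip2 compresses that file and ignores stdin, so the export should be redirected instead:

    mongoexport -d db -c collection | bzip2 -9 > col.bz2
    bzcat col.bz2 | mongoimport -d db_prod -c collection --upsert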
[23:47:11] <cchristie> i'm trying to get the latest version of mongo installed, but i am having difficulty
[23:47:14] <cchristie> sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
[23:47:51] <cchristie> when i run `sudo apt-get update` i am presented with W: Failed to fetch http://repo.mongodb.org/apt/ubuntu/dists/utopic/mongodb-org/3.0/multiverse/binary-amd64/Packages 404 Not Found
[23:48:09] <cchristie> W: Failed to fetch http://repo.mongodb.org/apt/ubuntu/dists/utopic/mongodb-org/3.0/multiverse/binary-i386/Packages 404 Not Found
[23:58:35] <sahilsk> Greetings. I just installed the MMS agent using the Ubuntu Debian file. I can see the agent entry in the MMS GUI. I click on "install monitoring agent" and "install backup agent" but nothing happens.
[23:58:35] <sahilsk> https://gist.github.com/sahilsk/6ca01478d67c7b06616e
[23:58:50] <sahilsk> ^ this is the MMS log. It's stuck in goal state. Any help would be great
[23:59:19] <sahilsk> All 0 Mongo processes are in goal state, Monitoring agent in goal state, Backup agent in goal state